Exporting Mailboxes Larger Than 2 GB On An Exchange Server

At one point, the MAPI used by Exchange was the same as the MAPI used by Outlook. But many years ago (literally - pre-Exchange 5.5) the MAPI used by Exchange server began to diverge from the MAPI used by Outlook. This isn't particularly surprising, as the needs of a MAPI server are the inverse to the needs of a MAPI client. By Outlook 2003/Exchange 2003, a significant item was that client-MAPI (the MAPI used by Outlook) supports Unicode PSTs. Server-MAPI (the MAPI used by Exchange) only supports ANSI PSTs.

While there are MANY under-the-hood differences between the two types of PSTs, the key issue for most people is that ANSI PSTs are limited to 2 GB in size (the actual limit is about 1.8 GB of data, but this leads to a file size of just about 2 GB). Unicode PSTs do not have that limitation and can of any "reasonable" size. (They are limited by default to 20 GB, but can grow beyond that by adding a registry key for Outlook's MAPI.)

This leads to a challenge on Exchange 2003 or Exchange 2007 servers when using ExMerge (yes, yes, ExMerge isn't officially supported against Exchange 2007 but it works just fine). ExMerge can only use server MAPI. However, mailboxes may be larger than 2 GB. So what do you do?

Glen Scales, an Exchange MVP with a developer bent, developed a script in early 2007 to address this problem. Glen's original script is here.

I've recently been working on a project for a large company and we needed to do this export against thousands of mailboxes. I started with Glen's script and ran into a few issues, so I've more-or-less rewritten it; but the basic concept is the same - scan a mailbox on an Exchange server and break it into chunks. Each chunk will not be larger than 1.8 GB and each chunk will not contain any folder that contains more than 16,300 items (16K items per folder was another limit of ANSI PSTs).

I give great thanks to Glen for his original script, without his script this project would've been much harder.

If you actually want to know how the script works - I refer you to Glen's original blog on the topic. The mechanism has not changed.

Without further ado...

''
'' ExMBspanPst.vbs
''
'' Based on a script from Glen Scales
'' http://gsexdev.blogspot.com/2007/01/exporting-mailbox-larger-then-2-gb-and.html
''
'' Requires Outlook Redemption, but not Outlook
'' http://www.dimastr.com/redemption
''
'' Fixes a few bugs:
''	orig. script didn't split at 16K messages in a folder
''	orig. script didn't report progress in 2, 3, ... n PSTs
''	orig. script could create two copies of a message in output PST
''	orig. script didn't send all status output to output file
''	orig. script didn't check for the presence of existing PST
'' Adds a feature or two:
''	accepts input mailbox as parameter
''	a number of stability improvements (error checks)
''	added "option explicit" and updated code for support of same
''	copies HiddenItems (Associated Items) and DeletedItems as well as normal items
'' Almost a full source reformat (so I could understand the code better)
'' Removed a fair bit of unused code (although I may have added more of my own)
'' Release resources whenever possible
'' Use RDO for all things, don't fall back to CDO
''
'' Update published with permission of Glen.
''
'' Michael B. Smith
'' The Essential Exchange
'' michael@TheEssentialExchange.com
''
Option Explicit

Dim mbMailbox      '' name of the mailbox (Exchange alias/mailNickname works best)
Dim servername     '' name of the Exchange server hosting the mailbox
Dim bfbaseFilename '' prefix used to name the new PST
Dim pfFilePath     '' directory in which to store PSTs

mbMailbox = WScript.Arguments(0)
''
'' these should be the only values you need to change
''
servername = "exchserver"
bfBaseFilename = "set1-" & mbMailbox
pfFilePath = "c:\temp\"
''
'' end change area
''

Dim fnFileName '' name of the output PST (set by CreatenewPst; uses pfFilePath, bfBasefileName and mbMailbox)
Dim fNumber    '' index of the output PST (will be updated to start at 1 by CreateNewPst)

fnFileName = ""
fNumber = 0

Dim doDictionaryObject '' scripting.dictionary, contains list of entry-ids present in current PST
Dim fso                '' scripting.filesystemobject
Dim RDOSession         '' redemption.rdosession

Set doDictionaryObject = CreateObject("Scripting.Dictionary")
Set fso                = CreateObject("Scripting.FileSystemObject")
set RDOSession         = CreateObject("Redemption.RDOSession")

Dim tsize       '' the next time I report the size of the new PST (that is, it's calculated size)
Dim tnThreshold '' maximum size (in MB) of a PST, before I switch to a new one

tsize = 10
tnThreshold = 1800

Dim PST
Dim IPMRoot
Dim pfPstFile            '' object for the new PST
Dim PstRootFolder        '' object pointing to the root of the current PST

PST           = Empty    '' PST is the Redemption pointer to the PST
IPMRoot       = Empty    '' IPMRoot is the root of the IPM subtree in the mailbox
pfPstFile     = Empty    '' fso.GetFile(fnFileName) returns the object for this file

PstRootFolder = Empty    '' This variable never actually gets set, but removing it would've
                         '' called for refactoring too much code - when the code is fixed
                         '' to set this value properly, other stuff breaks. That's why the
                         '' return values are commented out in ProcessFolder[Root | Sub].

Dim wfile                '' file we write to for informational messasges
Dim dfDeletedItemsFolder '' the deleted items folder in the current input mailbox
Dim miLoop               '' used for looping through IPMRoot.Folders
Dim fld                  '' used for looping through IPMRoot.Folders
Dim iMessageCount        '' total number of messages processed

iMessageCount = 0

	''
	'' MAIN code
	''

	On Error Resume Next
	Set wfile = fso.opentextfile(pfFilePath & bfBaseFilename & ".txt", 2, true)
	If Err Then
		WScript.Echo "Main: Error: Could not open " & pfFilePath & bfBaseFilename & ".txt"
		WScript.Quit 1
	End If
	On Error Goto 0

	msg "Main: debug output text file is " & pfFilePath & bfBaseFilename & ".txt"
	msg "Main: will attempt login to mailbox " & mbMailbox & " on server " & servername

	RDOSession.LogonExchangeMailbox mbMailbox, servername
	Set dfDeletedItemsFolder = RDOSession.GetDefaultFolder(3)
	Call CreateNewPst

	msg "Main: Enumerating Mailbox " & wscript.arguments(0)

	For miLoop = 1 to IPMRoot.Folders.Count
		Set fld = IPMRoot.Folders(miLoop)
		Call ProcessItems(fld)
		If fld.Folders.count > 0 then
			msg "Main: Calling Enumfolders for " & fld.Name
			Call Enumfolders(fld, PstRootFolder, 2)
		End if
		Set fld = Nothing
	Next

	msg "Main: A total of " & iMessageCount & " messages were processed."
	msg "Main: Done"

	'' clean up and release resources
	Set dfDeletedItemsFolder = Nothing
	RDOSession.Logoff
	wfile.Close
	Set wfile      = Nothing
	Set RDOSession = Nothing
	Set fso        = Nothing

Sub msg(ByVal str)
	WScript.Echo str
	wfile.WriteLine(str)
End Sub

Function Enumfolders(FLDS, RootFolder, ltype)
	''
	'' The current folder in the source mailbox is FLDS
	'' RootFolder should be the parent folder of the current folder
	''
	'' If ltype == 2, then process the non-folder items in the current folder (i.e., messages)
	'' If ltype == 1, then process the sub-folders in the current folder
	''
	Dim fl  '' used for looping through FLDS.Folders
	Dim fld '' used for looping through FLDS.Folders

	For fl = 1 to FLDS.Folders.count
		Set fld = FLDS.Folders(fl)
		If ltype = 1 then
			Call ProcessFolderSub(fld, RootFolder)
		Else
			Call ProcessItems(fld)
		End If

		msg "Enumfolders: " & fld.Name

		If fld.Folders.Count <> 0 then
			Call Enumfolders(fld, fld.EntryID, ltype)
		End if
		Set fld = Nothing
	Next
End function

Function CreateNewPst
	''
	'' conceivably, we should check ERR.number for almost every statement in this routine
	'' realistically, that would make the code almost unreadable and incomprehensible
	''
	Dim pstfld '' used for looping through PstRoot.Folders
	Dim fiLoop '' used for looping through IPMRoot.Folders
	Dim fld    '' used for looping through IPMRoot.Folders

	doDictionaryObject.RemoveAll
	fNumber = fNumber + 1
	fnFileName = pfFilePath & bfBaseFilename & "-" & fNumber & ".pst"

	msg "CreateNewPst: About to create new PST named " & fnFileName

	If fso.FileExists(fnFileName) Then
		msg "CreateNewPst: Error: PST already exists: " & fnFileName
		WScript.Quit 1
	End If

	If Not IsEmpty(PST) Then
		Set PST = Nothing
	End If
	Set PST = RDOSession.Stores.AddPSTStore(fnFileName, 1,  "Exported MailBox-" & now())

	If fnumber = 1 Then
		Dim pstroot

		Set pstroot = RDOSession.GetFolderFromID(PST.IPMRootFolder.EntryID, PST.EntryID)
		For Each pstfld In PstRoot.folders
			If pstfld.Name = "Deleted Items" Then
				doDictionaryObject.add dfDeletedItemsFolder.EntryID, pstfld.EntryID
				msg "CreateNewPst: Added Deleted Items Folder to dictionary"
				Exit For
			End If
		Next
		Set pstroot = Nothing
	End If

	If Not IsEmpty(IPMRoot) Then
		Set IPMRoot = Nothing
	End If
	Set IPMRoot = RDOSession.Stores.DefaultStore.IPMRootFolder

	msg "CreateNewPST: processing each new default folder in new PST"
	For fiLoop = 1 to IPMRoot.Folders.count
		Set fld = IPMRoot.Folders(fiLoop)
		If fld.Name <> "Deleted Items" Then
			PstRootFolder = ProcessFolderRoot(fld, PST.IPMRootFolder.EntryID)
		End If
		If fld.Folders.count > 0 Then
			Call Enumfolders(fld, fld.EntryID, 1)
		End If
		Set fld = Nothing
	Next

	If Not IsEmpty(pfPstFile) Then
		Set pfPstFile = Nothing
	End If
	Set pfPstFile = fso.GetFile(fnFileName)

	tsize = 10 '' back at the beginning now

	msg "CreateNewPst: Created new PST named: " & fnFileName
End Function

Function ProcessFolderRoot(Fld, parentfld)
	Dim newFolder '' next folder to be examined
	Dim CDOPstFld '' a particular folder parent in the PST based on the entryid of the PST

	msg "ProcessFolderRoot: " & fld.Name

	Set CDOPstfld = RDOSession.GetFolderFromID(parentfld, PST.EntryID)
	Set newFolder = CDOPstfld.Folders.ADD(Fld.Name)	
	'''ProcessFolderRoot = newFolder.EntryID
	newfolder.fields(&H3613001E) = Fld.fields(&H3613001E)

	doDictionaryObject.add Fld.EntryID, newfolder.EntryID

	Set newFolder = Nothing
	Set CDOPstfld = Nothing
End Function

Function ProcessFolderSub(Fld, parentfld)
	Dim newFolder '' next folder to be examined
	Dim CDOPstFld '' a particular folder parent in the PST based on the entryid of the PST

	msg "ProcessFolderSub: " & fld.Name

	Set CDOPstfld = RDOSession.GetFolderFromID(doDictionaryObject.item(parentfld), PST.EntryID)
	Set newFolder = CDOPstfld.Folders.ADD(Fld.Name)	
	'''ProcessFolderSub = newFolder.EntryID
	newfolder.fields(&H3613001E) = Fld.fields(&H3613001E)

	doDictionaryObject.add Fld.EntryID, newfolder.EntryID

	Set newFolder = Nothing
	Set CDOPstfld = Nothing
End Function

Sub ReportError(prefix, Fld, item, txt)
	msg prefix & " " & "Error Processing Item #" & item & " in " & Fld.Name & " " & txt
	msg prefix & " " & "EntryID of Item: " & Fld.items(item).EntryID
	msg prefix & " " & "Subject of Item: " & Fld.items(item).Subject
End Sub

Function CalcNewSize(pstFile, item)
	''
	'' calculate what the new physical size of the pstFile will be after adding the next item
	'' to it. do so safely, avoiding all possible faults, and return the value in megabytes,
	'' rounded up.
	''
	Dim pstSize, itemSize, totalSize

	On Error Resume Next
	pstSize = pstFile.Size
	If Err.Number Then
		pstSize = 1048576 '' assume 1 MB for the heck of it
	End If

	Err.Clear
	itemSize = item.Size
	If Err.Number Then
		itemSize = 1048576 '' assume 1 MB for the heck of it
	End If

	Err.Clear
	totalSize = Int ((pstsize + itemSize) / 1048576) + 1
	If Err.Number Then
		totalSize = 3
	End If
	On Error Goto 0

	CalcNewSize = totalSize
End Function	

Sub ProcessItems(Fld)
	Dim strType             '' the IPM type of the input folder
	Dim fiItemLoop          '' used to loop through the input folder
	Dim fiCDOcount          '' how many messages CDO told us to expect
	Dim pfPredictednewSize  '' predicted size of the output PST after the next message is written
	Dim dfDestinationFolder '' output folder in the current output PST
	Dim objMessages         '' collection of messages contained by the source folder
	Dim objMessage	        '' current message of interest from the source folder
	Dim srcFld              '' the source folder
	Dim strName             '' name of the source folder
	Dim i                   '' used as a dummy
	Dim iCount              '' how many messages have been stored in the output folder
	Dim totalMessagesRead
	Dim totalMessagesWritten

	iCount = 0
	totalMessagesRead = 0
	totalMessagesWritten = 0

	Const iCountmax = 16300 '' must be less than 16383, which is the number of messages that CAN be stored
                                '' per output folder in an ANSI PST

	strtype = Fld.fields(&H3613001E)

	'''' frankly, I don't understand the distinction below, it was in the
	'''' original code, but the two should be equivalent.
	If strType = "IPF.Contact" Then
		Set srcFld = Fld
	Else
		Set srcFld = RDOSession.GetFolderFromID(Fld.EntryID)
	End If
	strName = srcFld.Name

	For i = 1 to 3
		''' there are 3 collections in every folder that we might be interested in
		Select Case i
			Case 1
				Set objMessages = srcFld.Items
				msg "ProcessItems: " & strType & ": Processing Folder: " & strName & _
					" (contains " & objMessages.Count & " normal items)"
			Case 2
				Set objMessages = srcFld.HiddenItems
				msg "ProcessItems: " & strType & ": Processing Folder: " & strName & _
					" (contains " & objMessages.Count & " hidden/associated items)"
			Case 3
				Set objMessages = srcFld.DeletedItems
				msg "ProcessItems: " & strType & ": Processing Folder: " & strName & _
					" (contains " & objMessages.Count & " deleted items)"
		End Select

		fiCDOcount = objMessages.Count

		Set dfDestinationFolder = RDOSession.GetFolderFromID(doDictionaryObject.item(Fld.EntryID), PST.EntryID)

		For fiItemloop = 1 to fiCDOcount
			iCount            = iCount + 1
			totalMessagesRead = totalMessagesRead + 1

			If 0 = (fiItemLoop Mod 100) Then
				wscript.echo "... processing message " & fiItemLoop & " of " & fiCDOcount
			End If

			'' I SO wish VBScript had a Continue statement
			On Error Resume Next
			Err.Clear
			Set objMessage = objMessages(fiItemLoop)
			If Err.Number <> 0 Then
				msg "ProcessItems: corrupt message in folder, item number " & fiItemLoop & _
					" of " & fiCDOcount & ", 0x" & _
					Hex(Err.Number) & " (" & Err.Description & ")"
			Else
				On Error Goto 0

				pfPredictednewSize = CalcnewSize(pfPstFile, objMessage)
				If pfPredictednewSize >= tsize Then
					Wscript.echo "... additional 10 MB Exported, total size is now " & tsize & " MB" & _
						" (processing item #" & fiItemLoop & " of " & fiCDOcount & ")"
					tsize = tsize + 10
				End if

				If (pfPredictednewSize >= tnThreshold) or (iCount > iCountmax) Then
					msg "ProcessItems: " & strType & ": New PST about to be created - Destination - Number of Items : " & _
						dfDestinationFolder.Items.Count & _
						" (processing item #" & fiItemLoop & " of " & fiCDOcount & ")"

					Call CreateNewPst
					Set dfDestinationFolder = Nothing
					Set dfDestinationFolder = RDOSession.GetFolderFromID(doDictionaryObject.item(Fld.EntryID), PST.EntryID)

					iCount = 0
				End If

				On Error Resume Next
				Err.Clear
				objMessage.CopyTo(dfDestinationFolder)
				If Err.Number <> 0 Then
					Dim rdosrc

					Call ReportError ("ProcessItems: " & strType & ":", Fld, fiItemloop, "(copyto - likely fatal)")
					msg "ProcessItems: 0x" & Hex(Err.Number) & ": " & Err.Description
					Err.Clear

					''' Try to copy a slightly different way before giving up
					Set rdosrc = RDOSession.GetMessageFromID(objMessage.EntryId)
					rdosrc.CopyTo(dfDestinationFolder)
					If Err.Number <> 0 Then
						msg "ProcessItems: " & strType & ": (copyto): Also Failed RDO Copy"
						msg "ProcessItems: 0x" & Hex(Err.Number) & ": " & Err.Description
					Else
						msg "ProcessItems: " & strType & ": (copyto): Copied with RDO Okay"
						totalMessagesWritten = totalMessagesWritten + 1
					End If
					Set rdosrc = Nothing
				Else
					totalMessagesWritten = totalMessagesWritten + 1
				End If
			End If
			On Error Goto 0

			Set objMessage = Nothing
		Next
	Next

	msg "ProcessItems: " & strType & ": Source - Number of Items : " & totalMessagesRead & _
	    " Destination - Number of Items : " & totalMessagesWritten

	iMessageCount     = iMessageCount + totalMessagesRead

	Set dfDestinationFolder = Nothing
	Set objMessages         = Nothing
	Set srcFld              = Nothing
End Sub

Until next time...

If there are things you would like to see written about, please let me know!

Published Friday, October 16, 2009 12:57 PM by michael
Filed under: ,

Comments

Friday, October 16, 2009 4:21 PM by A better solution for 2+ GB PST Export « KAPes musings !

# A better solution for 2+ GB PST Export &laquo; KAPes musings !

Pingback from  A better solution for 2+ GB PST Export &laquo; KAPes musings !