Data archiving

CodeTwo Backup saves data in local repositories called storages. For safety reasons, backed-up data can be stored only on local disk drives. As the volume of backed-up data grows, so does the size of each storage. To keep your local disk drives from filling up and straining your systems, CodeTwo Backup provides data retention and archiving features.

Archive jobs

Archive jobs are used to create exact (1:1) copies of storages. Combined with the program's data retention policy, they let you control storage growth: only the most recent backup data is kept locally, while the rest is stored on an external storage device.

Archive jobs are defined separately for each storage by using an archive job configuration wizard. Each storage is treated as a whole, so the archive process creates an exact copy (snapshot) of the current storage state. You can also choose to archive only the Exchange data or only the SharePoint data backed up in a given storage. The program archives all backed-up versions of Exchange and SharePoint items as well as the folder structure of Exchange mailboxes and SharePoint sites.

Info

Every time an archive job is started, a completely new archive is created. The incremental backup mechanism is available only for backup jobs, not for archive jobs.

Archived Exchange data is saved as versioned XML files and FTS binary data files (the latter are always encrypted). Exchange archives mirror the folder structure of mailboxes, and each item is archived into a separate file. Encrypted binary files are also used to store SharePoint archives. However, all backed-up SharePoint data (site collections, team sites and OneDrive sites) is archived into one or more DAT files, the size of which can be adjusted. Data format conversion during archiving (and importing) is transparent to the user. Moreover, archives keep metadata about their own structure and version, so future versions of the program will be able to import old archives and convert them into current formats.

When you create an archive, the specified target location also contains certain configuration files. These files must not be deleted because they ensure the proper operation of CodeTwo Backup. ArchiveOwner.xml, for example, prevents a user from reusing a location that already contains an archive. Each archive job must use a separate target folder for the archived data. It is not possible to share an archive location between multiple archive jobs.
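
The one-folder-per-job rule can be checked before you point a new archive job at a folder. The following is only a minimal sketch, not how CodeTwo Backup itself performs the check: it assumes that ArchiveOwner.xml sits at the top level of the target folder, and the function name is hypothetical.

```python
from pathlib import Path

# File name taken from the text above. Its contents and its exact position
# inside the target folder are not documented here (top level is assumed).
OWNER_MARKER = "ArchiveOwner.xml"


def is_free_archive_target(folder: Path) -> bool:
    """Return True if `folder` does not already appear to hold an archive.

    Hypothetical helper: it only looks for the marker file and never
    modifies or deletes anything in the folder.
    """
    return not (folder / OWNER_MARKER).exists()
```

A script that prepares target folders for several archive jobs could call `is_free_archive_target(Path(...))` for each candidate and skip any folder for which it returns False.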

Other properties of archive jobs:

  • Unlike storages, archives can be kept on removable or network drives.
  • Archive jobs cannot be paused and resumed. Once started, the archive process has to finish. If stopped by force, an archive will be considered corrupted and automatically removed by the program.
  • When an archive job starts, the storage retention policy is suspended. It resumes after the job is complete.
  • Backup jobs can be performed at the same time as archive jobs.
  • Indexer data for a particular storage is not archived. The Indexer will re-index all data in a new storage once an archive is imported.
  • You can protect an archive with a password so that only you can import it.
  • In addition to the storage retention policy, which, for example, deletes older Exchange and/or SharePoint items from the storage, you can configure an archive retention policy to manage the number of archive versions that are kept.
  • Archive jobs can be scheduled.
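
To illustrate what managing the number of archive versions can mean in practice, here is a minimal model of a keep-the-newest-N rule. This is an assumption made for illustration only: the text above does not describe how the archive retention policy selects versions, and `ArchiveVersion` and `versions_to_remove` are hypothetical names, not part of CodeTwo Backup.

```python
from datetime import datetime
from typing import NamedTuple


class ArchiveVersion(NamedTuple):
    name: str          # hypothetical label for one archive version
    created: datetime  # when that archive job finished


def versions_to_remove(versions: list[ArchiveVersion], keep: int) -> list[ArchiveVersion]:
    """Return the versions that exceed the `keep` newest ones.

    Assumed rule: sort by creation time, keep the newest `keep`,
    and mark everything older for removal.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    newest_first = sorted(versions, key=lambda v: v.created, reverse=True)
    return newest_first[keep:]
```

Because every archive job creates a complete new archive, the number of versions kept has a direct, roughly linear effect on the space needed at the archive location.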

PST archive jobs

Exchange data can also be archived to PST files. Because PST is a proprietary Microsoft file format, archiving to PST comes with some limitations:

  • Archiving to PST is only available for Exchange data.
  • A PST archive job creates a separate PST file for every mailbox and public folder selected from a storage. It is not possible to archive a whole storage, containing data from multiple mailboxes, to a single PST file.
  • PST files are not encrypted. Although CodeTwo Backup offers password protection for PST archives, it should not be treated as a way to store data safely. PST archive password protection uses the standard that Microsoft defined for this file type, and the methods in that standard were cracked a long time ago. Software that breaks PST passwords within seconds is easy to find. CodeTwo did not implement stronger password protection for PST archives because such PST files could no longer be imported to MS Outlook.
  • Only the current state of a mailbox or public folder in a storage is copied to a PST archive. There is no item versioning because the PST file format does not support it.
  • Archiving data to PST files is a one-way operation, i.e. it is not possible to import data from a PST file back to a storage.
  • A PST archive file cannot be larger than 10 GB, so if there is more data to be archived, the archive job automatically creates multiple PST files, each no bigger than 10 GB. This is by design, even though MS Outlook accepts PST files as large as 50 GB. During our extensive testing, we found that Outlook behaves inconsistently when importing PST files bigger than 10 GB. Also, due to the imperfections of the PST format, PST files become corrupted rather easily, and the problem is more visible with larger files. Therefore, for data safety, we limited the archived PST file size to 10 GB. The program does not divide the data between the PST files according to any particular pattern. Mailbox data is simply copied folder by folder to a single PST file, and once its size reaches 10 GB, another PST file is created.
  • PST archives were implemented to allow easy import of archived items directly to MS Outlook. Given this purpose and the limitations listed above, PST archives should not be treated as a proper and safe method of long-term data archiving.

Except for the above, PST archive jobs work in a similar way to standard archive jobs and are configured using a wizard as well.
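
The folder-by-folder splitting described in the 10 GB limitation can be modeled roughly as follows. This is a simplified sketch, not the program's actual code: it assumes that a new file is started whenever the next item would push the current file over the limit, that a folder may therefore span two files, and that 10 GB means 10 × 1024³ bytes. The function name and data layout are hypothetical.

```python
# 10 GB limit from the text; interpreting GB as 1024**3 bytes is an assumption.
PST_LIMIT_BYTES = 10 * 1024**3


def split_into_pst_files(
    folders: dict[str, list[int]], limit: int = PST_LIMIT_BYTES
) -> list[list[tuple[str, int]]]:
    """Assign items to consecutive PST files, walking folder by folder.

    `folders` maps a folder name to the sizes (in bytes) of its items,
    in the order they are copied. Each returned file is a list of
    (folder name, item size) pairs.
    """
    files: list[list[tuple[str, int]]] = [[]]
    current_size = 0
    for folder, item_sizes in folders.items():
        for size in item_sizes:
            # Start a new file when the next item would exceed the limit.
            # An item larger than the limit still gets a file of its own.
            if files[-1] and current_size + size > limit:
                files.append([])
                current_size = 0
            files[-1].append((folder, size))
            current_size += size
    # Drop the initial empty file if there was nothing to copy.
    return [f for f in files if f]
```

In this model no attempt is made to keep a folder inside one file or to balance file sizes, which matches the statement that the data is not divided according to any particular pattern.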

Importing archives

Standard archives (not PST archives) can be imported to review or restore old data. Importing works as a one-time job that creates a new storage containing the imported items. Importing to an existing storage is not allowed, to avoid item version conflicts. Imported items are always stamped with the current date and time. Learn how to import archived storages
