Migrating Salesforce ContentDocuments (Files) - EKWIS

Background

Data migration can be a daunting task, particularly when it comes to transferring files within Salesforce. This comprehensive guide is designed to assist anyone who needs to accomplish this crucial step.

Although our focus is primarily on exporting files from one Salesforce org to another, we have also included notes on how the process may vary if you are migrating files from an external system or directly from your local computer.

Required Tools

This guide is based on using a MacBook, but the process should be similar for Windows users. Keep in mind that you may need to find equivalent commands for instances where the terminal is mentioned.

  1. Salesforce Data Loader
  2. Microsoft Excel
  3. Terminal

Exporting Data from the Old Org

To export files, use Salesforce’s Data Export feature, which can be found by searching for “data export” in the Setup menu.

Make sure to select the “Include Salesforce Files and Salesforce CRM Content document version” checkbox.

Export only the ContentVersion and ContentDocumentLink objects, as no other objects are required. The relationships between files and records are held in ContentDocumentLink.

Upon completion, you will receive multiple zip files, each containing approximately 500MB of data (unless your org has less than 500MB of data in total).

The following files and folders will be included:

  1. ContentVersion folder(s): Contains the files themselves, each named with its ContentVersion ID. There may be multiple folders.
  2. ContentVersion.csv: Maps each file in the ContentVersion folder to its details, such as Title, PathOnClient, and ContentDocumentId.
  3. ContentDocumentLink.csv: Links each file (ContentDocumentId) to the records it is shared with (LinkedEntityId).

NOTE: Only the first zip file created will contain the CSVs. All subsequent zip files will contain only the files themselves.

Organizing the Salesforce Exports

To keep everything organized, unzip all ContentVersion folders into a single ContentVersion folder on your local drive.

For Mac users, the following command can be helpful:

    mv -iv folder1/* folder2/

This command moves all files from “folder1” to “folder2.” Use this command for each ContentVersion folder, with “folder1” representing the source folder and “folder2” representing the destination ContentVersion folder.
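If you have many ContentVersion folders, repeating the command by hand gets tedious. The following is a minimal Python sketch of the same merge, not part of the original process. The function name merge_folders is hypothetical, and instead of prompting as mv -i does, it skips any file that already exists in the destination and reports it.

```python
import shutil
from pathlib import Path


def merge_folders(sources, destination):
    """Move every file from each source folder into destination.

    Existing files are never overwritten: clashes are skipped and
    returned so they can be checked by hand.
    """
    dest = Path(destination)
    dest.mkdir(parents=True, exist_ok=True)
    skipped = []
    for source in sources:
        for item in sorted(Path(source).iterdir()):
            if not item.is_file():
                continue  # ignore sub-folders
            target = dest / item.name
            if target.exists():
                skipped.append(item)  # leave the clashing file where it is
                continue
            shutil.move(str(item), str(target))
    return skipped
```

Calling merge_folders(["folder1", "folder2"], "ContentVersion") would move the contents of both source folders into ContentVersion and return a list of any files it left behind.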

Your final folder structure should look like this:

  1. ContentVersion: Contains all the files that need to be uploaded.
  2. CSVs: Contains all the CSVs I used to sort the data, although I'm only uploading AccountContentVersion.csv in this example.
  3. Original zips: A folder I kept as a backup of the data downloaded from Salesforce.

Handling Data that Didn’t Originate from Salesforce

If you’re uploading files that didn’t come from a Salesforce org, the overall structure remains the same. All files for upload should be placed in the same folder. However, generating the PathOnClient and VersionData paths required for the import may be more challenging. We recommend using the terminal on a Mac to print these paths by following these steps:

  1. Open Terminal.
  2. Run cd, then drag the folder containing all your files into the Terminal window to quickly obtain its path.
  3. Run pwd to ensure you are in the correct directory.
  4. Run find . -type f.

The Terminal will then display all the file paths, which you can copy and paste into your import file.
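As an alternative to copying and pasting from the Terminal, the paths can be written straight to a CSV. This is a rough Python sketch under a few assumptions: the function names are hypothetical, the Title is taken from the file name without its extension, and VersionData reuses the PathOnClient value because there is no exported Salesforce ID to point at.

```python
import csv
from pathlib import Path

COLUMNS = ["Title", "PathOnClient", "VersionData"]


def list_file_rows(folder):
    """Return one import row per file found under folder, recursively."""
    root = Path(folder).resolve()
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rows.append({
                "Title": path.stem,        # assumption: file name without extension
                "PathOnClient": str(path),
                "VersionData": str(path),  # same path for non-Salesforce files
            })
    return rows


def write_rows(rows, csv_path):
    """Write the rows to a CSV that can be finished off in Excel."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Like find . -type f, this picks up hidden files such as .DS_Store, so review the rows before uploading.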

Creating the First Import/Upsert File

Begin by creating a new ContentVersion.csv file based on the one you exported earlier. You only need to include the following columns:

  • VersionData
  • Legacy_Salesforce_ID__c
  • PathOnClient
  • Title
  • Description
  • OwnerId
  • FirstPublishLocationId

TIP: To make the migration process more efficient, consider migrating one object at a time. You can achieve this by filtering the ContentDocumentLink file on the first three characters of the LinkedEntityId, which identify the object type (for example, 001 for Account). You can then use the VLOOKUP function in the ContentVersion file to create a new file containing only the records for that object. Doing so also lets you limit file storage usage to the key objects your customer wants to import.
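The filter-and-lookup in the tip can also be sketched in Python. This is only an illustration: rows_for_object is a hypothetical helper, and it assumes both CSVs have already been read into lists of dictionaries (for example with csv.DictReader) using the exported column names.

```python
def rows_for_object(link_rows, version_rows, key_prefix):
    """Keep only the ContentVersion rows whose document is linked to a
    record whose ID starts with key_prefix (for example "001" for Account)."""
    linked_documents = {
        link["ContentDocumentId"]
        for link in link_rows
        if link["LinkedEntityId"].startswith(key_prefix)
    }
    return [
        version
        for version in version_rows
        if version["ContentDocumentId"] in linked_documents
    ]
```

Writing the result out under a name such as AccountContentVersion.csv gives one import file per object.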

This table provides a description of each column:

| Field Name | Type | Description |
|---|---|---|
| Title | Required | The title that will appear in the Files related list. |
| Description | Optional | A description of the file. |
| PathOnClient | Required | The content of the PathOnClient field from ContentVersion.csv, with the file path on your local machine added before it, e.g. /Users/mattknight/Downloads/FileMigration/ContentVersion/worddoc.docx |
| VersionData | Required | The full local path to the exported file, e.g. /Users/mattknight/Downloads/FileMigration/ContentVersion/0684G000005lvNnQAI. The ID is the ID column from the ContentVersion.csv file. If you are uploading files that were not exported from Salesforce, use the same value as PathOnClient to locate the document. |
| Legacy_Salesforce_ID__c | Optional (but required to upload ContentDocumentLinks and update owners easily) | The ContentDocumentId from the ContentVersion.csv file. |
| FirstPublishLocationId | Optional (only needed when uploading a file to a single record; ignore this field if you plan to link files to multiple records) | The ID of the record the file is initially shared with, if the file is only shared with one record. In my example this is an Account ID from the new org, which I populated using the data in the ContentDocumentLink.csv file and a few VLOOKUPs. It links the file to the record automatically on insert and allows us to set a custom OwnerId, as the file is being uploaded to a shared area. |
| OwnerId | Optional | Ensures the file belongs to the correct person in Salesforce and is visible in "My Files". |

Here is an example of how my CSV looked before upload:

| VersionData | Legacy_Salesforce_ID__c | PathOnClient | Title | Description | OwnerId | FirstPublishLocationId |
|---|---|---|---|---|---|---|
| /Users/mattknight/Downloads/FileMigration/ContentVersion/0684G000005lw1nQAA | 0694G000006Qke0QAC | /Users/mattknight/Downloads/FileMigration/ContentVersion/worddoc.docx | Bank Meeting Template | | 005D0000005iqdCIAQ | 0013H00000Xl5j3QAB |
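For larger exports, the same row can be built with a script instead of spreadsheet formulas. The sketch below follows the column descriptions above; build_import_row and local_path are hypothetical names, and the owner and record IDs are assumed to have been looked up in the new org already.

```python
def local_path(files_dir, name):
    """Join the local folder and a file name with a single slash."""
    return files_dir.rstrip("/") + "/" + name.lstrip("/")


def build_import_row(exported, files_dir, owner_id="", first_publish_location_id=""):
    """Turn one row of the exported ContentVersion.csv into an import row.

    exported is a dict keyed by the exported column names;
    files_dir is the local folder holding the unzipped files.
    """
    return {
        # The exported file on disk is named after the ContentVersion ID.
        "VersionData": local_path(files_dir, exported["Id"]),
        # The old ContentDocumentId is kept for the later link and owner steps.
        "Legacy_Salesforce_ID__c": exported["ContentDocumentId"],
        "PathOnClient": local_path(files_dir, exported["PathOnClient"]),
        "Title": exported["Title"],
        "Description": exported.get("Description", ""),
        "OwnerId": owner_id,
        "FirstPublishLocationId": first_publish_location_id,
    }
```

Leave owner_id and first_publish_location_id empty if you plan to link files to multiple records and assign owners afterwards.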

Uploading the File

The following steps should be familiar to those who have experience using the Salesforce Data Loader:

  1. Open the Data Loader and navigate to the settings. Set the batch size to 1.
  2. Choose either “Insert” or “Upsert” as the operation.
  3. Use the new ContentVersion.csv file to insert or upsert files into the target org’s ContentVersion object, mapping all the necessary fields mentioned earlier.
  4. Upload the file.
  5. Repeat this process for each object that you are importing.

NOTE: Ensure that your computer doesn’t go to sleep or shut down during this process, as it may take a substantial amount of time to complete.

To monitor the status of the upload, open some of the records used as the FirstPublishLocationId in your CSV. As the upload progresses, the files should all have previews and appear correctly in each record's Files related list.

Uploading Files to Multiple Records (ContentDocumentLinks)

If you didn’t use the FirstPublishLocationId during the initial import, follow these additional steps to associate your files with the correct records in Salesforce:

NOTE: To determine if this process is required, check for duplicate ContentDocumentIds in your originally exported ContentDocumentLink.csv file. If duplicates exist, it means some files are linked to more than one record.
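The duplicate check in the note can be sketched in a few lines of Python. The helper name is hypothetical, and it assumes the exported ContentDocumentLink.csv has been read into a list of dictionaries, for example with csv.DictReader.

```python
from collections import Counter


def documents_linked_more_than_once(link_rows):
    """Return each ContentDocumentId that appears on more than one link
    row, together with the number of rows it appears on."""
    counts = Counter(row["ContentDocumentId"] for row in link_rows)
    return {doc_id: count for doc_id, count in counts.items() if count > 1}
```

An empty result suggests every file is linked only once, so FirstPublishLocationId alone may be enough.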

  1. Upload the ContentVersion.csv as specified earlier, ignoring the OwnerId and FirstPublishLocationId. Remember to use a batch size of 1.
  2. Create a new ContentDocumentLink.csv file based on the old one. You will need to perform a VLOOKUP for each object type you are linking to. Here’s one method:
    1. Use =LEFT(A2,3) to identify the object identifiers.
    2. Filter out the objects you don’t want to link.
    3. Run a VLOOKUP for each object, bringing in its ID from the new org.
  3. You will also need the legacy Salesforce ID from the ContentVersion records you uploaded to find the new ContentDocumentIds.
    1. Use the following query to obtain the ContentDocumentId for the ContentDocumentLink upload: SELECT Legacy_Salesforce_ID__c, ContentDocument.Id FROM ContentVersion
    2. Alternatively, you can use the title of the ContentDocument in an upsert operation, but this may be less reliable.
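The lookups in these steps amount to two ID mappings: old record ID to new record ID, and legacy ContentDocumentId to new ContentDocumentId. The following Python sketch shows one way to apply them. The function name and the two dictionaries are assumptions for illustration; the second dictionary would be built from the results of the query above.

```python
def build_link_rows(old_links, new_record_ids, new_document_ids):
    """Translate exported ContentDocumentLink rows into rows for the new org.

    new_record_ids maps an old LinkedEntityId to its ID in the new org.
    new_document_ids maps a legacy ContentDocumentId to the new one.
    Rows that cannot be fully translated are returned separately.
    """
    rows, unmatched = [], []
    for link in old_links:
        record_id = new_record_ids.get(link["LinkedEntityId"])
        document_id = new_document_ids.get(link["ContentDocumentId"])
        if record_id is None or document_id is None:
            unmatched.append(link)  # e.g. an object you chose not to migrate
            continue
        rows.append({
            "LinkedEntityId": record_id,
            "ContentDocumentId": document_id,
            # Assumption: fall back to the values used in this guide's example.
            "ShareType": link.get("ShareType", "V"),
            "Visibility": link.get("Visibility", "AllUsers"),
        })
    return rows, unmatched
```

Reviewing the unmatched rows before the upload is a quick way to spot records or files that were missed in an earlier step.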

Before uploading, your file should look like this:

| LinkedEntityId | ContentDocumentId | ShareType | Visibility |
|---|---|---|---|
| 0018d00000R2XUaAAN | 0698d00000EKnAkAAL | V | AllUsers |

  • Here is an example with all the IDs that can assist with the VLOOKUPs:

| LinkedEntityId OLD | LinkedEntityId | Object | ContentDocument:Legacy_Salesforce_ID__c | ContentDocumentId | ShareType | Visibility |
|---|---|---|---|---|---|---|
| 00120000002iYWhAAM | 0018d00000R2XUaAAN | Account | 0694G00000Qbzx6QAB | 0698d00000EKnAkAAL | V | AllUsers |

  • Finally, you can reassign the ContentVersion records created earlier by updating or upserting the ContentVersion object:

| Legacy_Salesforce_ID__c | OwnerId |
|---|---|
| 0694G000006Qke0QAC | 005D0000005iqdCIAQ |
