Content migration for BPI
The challenge
Migrate three multilingual websites, built with Jahia in 2013 and 2014, to a WordPress platform.
The technical solution
Gather the necessary skills and use a structured, methodical approach.
Discover the BPI x Be Api story
How to successfully complete a technical migration assignment?
At the heart of the Bibliothèque publique d'information (Bpi) project was the technical migration of the BPI sites to WordPress.
These sites had been built with the Jahia CMS in 2013 and 2014 (three sites, each with a varying number of language versions).
To carry out this type of content import, you need:
- A good knowledge of the technical structure of the source data (Jahia in this case) and the ability to generate structured feeds from this data.
- A good knowledge of the target structure in a WordPress multisite and the skill to organize and "tidy up" this data for the new sites.
All the while, you need to bear in mind that this data will serve a variety of purposes:
- Display in different shapes and templates
- Features: sorting, search, links...
The data must also respect structuring and quality constraints in order to meet accessibility, search engine optimization (SEO) and, potentially, GDPR compliance objectives.
Our approach
This is the approach we built with BPI during the preparation and organization workshops for this migration of content from Jahia to WordPress.
Starting data
The BPI had planned to involve a service provider with a good knowledge of the Jahia CMS and legacy data. This is often necessary in this type of project, as it enables us to obtain feeds or exports that offer the maximum amount of information, and in a form that is fully usable at the time of import.
This is also the purpose of the "reversibility clauses" that you sometimes see in contracts or proposals from your agencies.
At Be API, we have often had to migrate sites from Joomla or Drupal to WordPress. In such cases, we propose to bring in an external partner who is an expert in Joomla or Drupal. To be completely transparent, we often suggest to the customer that this partner be present during the workshops, to help us collectively "get the best out" of the old site's data.
Arrival data
At this stage, the project team can describe all the "landing fields" for the data using the content structuring concepts that are well known in WordPress:
- Content types (CPT - Custom Post Type)
- Fields associated with a content type. For this project, they were created using ACF (Advanced Custom Fields). These fields are "typed": text fields, image fields, date fields, select fields, URL fields...
- Relationships between content types (P2P - Posts to Posts)
- Taxonomies: list of terms used to categorize content.
Creating this structure in WordPress also lets the customer see and check the back-office of their new sites. The entire back-office structure is visible early on in the project.
With the starting data and the arrival data both described, a correspondence table can be drawn up. For each piece of data, it lists:
- Starting data (with its characteristics)
- Treatment(s) to be carried out
- Data imported into WordPress (with its characteristics).
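A correspondence table of this kind (starting data, treatment, imported data) can be pictured as a list of rules. The sketch below is only an illustration in Python, not the project's actual table: the field names (`jcr_title`, `event_date`, `acf_event_date`) and the two treatments are hypothetical, chosen to show the three columns side by side.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MappingRule:
    source_field: str                # starting data: field name in the export
    transform: Callable[[str], str]  # treatment to be carried out
    target_field: str                # imported data: landing field in WordPress


def strip_text(value: str) -> str:
    """Remove leading and trailing whitespace."""
    return value.strip()


def to_iso_date(value: str) -> str:
    """Convert a DD/MM/YYYY date into YYYY-MM-DD."""
    day, month, year = value.split("/")
    return f"{year}-{month}-{day}"


# Hypothetical rules: none of these field names come from the real project.
RULES = [
    MappingRule("jcr_title", strip_text, "post_title"),
    MappingRule("event_date", to_iso_date, "acf_event_date"),
]


def apply_rules(source_item: dict[str, str], rules: list[MappingRule]) -> dict[str, str]:
    """Build the target record by applying every rule whose source field is present."""
    result: dict[str, str] = {}
    for rule in rules:
        if rule.source_field in source_item:
            result[rule.target_field] = rule.transform(source_item[rule.source_field])
    return result
```

Keeping the table as data rather than as code paths means the customer can review it line by line during the workshops, and a rule can be changed without touching the import logic.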
The import engine
The import engine is a program packaged as a WordPress plugin (more precisely, a WP-CLI command). Starting from a source data feed, this command performs the transformations described in the correspondence table. The data is then imported and stored in the target structure provided by WordPress.
This import engine also generates a file of redirects from the old URL structure to the new one, to facilitate the work of the agency's SEO department.
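As an illustration of the redirect file, here is a minimal sketch in Python. The real engine is a WP-CLI command and its output format is not described here, so the CSV layout, the column names and the `build_redirects` function are all assumptions.

```python
import csv
import io


def build_redirects(imported: list[tuple[str, str]]) -> str:
    """Return CSV text mapping each old URL to its new URL as a 301 redirect.

    `imported` is a list of (old_url, new_url) pairs collected during import.
    Unchanged URLs and repeated old URLs are skipped (first pair wins).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["old_url", "new_url", "status"])  # assumed column layout
    seen: set[str] = set()
    for old_url, new_url in imported:
        if old_url == new_url or old_url in seen:
            continue
        seen.add(old_url)
        writer.writerow([old_url, new_url, 301])
    return buffer.getvalue()
```

A flat file of this kind can then be reviewed by the SEO team and converted into whatever redirect mechanism the hosting setup uses.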
Checks and controls
The first import tests are carried out with a test dataset. Acceptance testing is shared between Be API and BPI. For the import of data from the BPI sites, we were able to rely on a customer with a thorough knowledge of its database and of the "business logic" behind the sites' content. The BPI was thus able to assess the quality of the import by checking "special cases" and "exceptions", which made this acceptance phase even more reliable.
Batch import of content
Once the import engine had been validated, we were able to agree on an import strategy. In agreement with BPI, we decided to carry out successive, incremental imports in batches.
This means that each import adds new data to what has already been imported, without affecting the data from previous imports.
This approach is also justified by the fact that the BPI took advantage of the migration period to modify, improve and harmonize certain imported content.
The BPI made this contribution through the WordPress back-office. Obviously, we had to avoid overwriting the enrichments made by the BPI at all costs.
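The principle of an incremental import that never overwrites earlier work can be sketched as follows. This is an assumption-laden illustration in Python: how the real engine recognizes content it has already imported is not stated here, so the stable `source_id` key and the in-memory `store` (standing in for the WordPress database) are hypothetical.

```python
def import_batch(store: dict[str, dict], batch: list[dict]) -> dict[str, int]:
    """Add items from `batch` to `store`, skipping any source_id already present.

    Skipped items are left untouched, so edits made after an earlier
    import are preserved.
    """
    counts = {"created": 0, "skipped": 0}
    for item in batch:
        source_id = item["source_id"]  # assumed stable identifier from the old site
        if source_id in store:
            counts["skipped"] += 1
            continue
        store[source_id] = dict(item)  # copy so later changes to the batch do not leak in
        counts["created"] += 1
    return counts
```

For example, if an item is imported, then edited in the back-office, a second batch containing the same `source_id` leaves the edited version in place and only creates the genuinely new items.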
Just before the new sites go online, only the most recent content remains to be imported to ensure that the database is complete and up to date.
In conclusion
This can be a technically complex assignment, requiring good organization and a rigorous methodology.
For this stage to be successful, it's essential to exchange ideas (through upstream workshops) and to establish good cooperation between all those involved. This is necessary to guarantee a good level of competence at every stage.
Over the past 10 years, we have carried out a good number of content migration projects to WordPress from a wide variety of sources (CMS or other).
This constitutes a wealth of experience that enables us to tackle this type of mission with confidence.