Migrate without Migraines
DESCRIPTION
In this talk, we'll look at the tools and modules available for migrating content into Drupal. I'll describe the workflow I've used to prepare, transform, and import thousands of records into Drupal. I'll share strategies for cleaning up and parsing data and doing it in a reliable, repeatable manner. You'll learn how to efficiently use PHP, Feeds, and Feeds XPath Parser modules to handle almost any data source thrown your way.
TRANSCRIPT
![Page 1: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/1.jpg)
Oscar Merida
Aug. 1, 2014
Migrating without Migraines
![Page 2: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/2.jpg)
Musketeers.me
Content is nomadic
![Page 3: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/3.jpg)
Managing Content Migrations
• Technical process features
• Automated
• Safe
• Adaptable
• See also “Hitch your Wagon”, 2013
• http://phpa.me/hitch-your-wagon
![Page 4: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/4.jpg)
Automated
Photo by pubsubhashis
![Page 5: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/5.jpg)
Safe
![Page 6: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/6.jpg)
Adaptable
![Page 7: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/7.jpg)
Migration Workflow
Sources → XML Documents → CMS
![Page 8: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/8.jpg)
Built from many sources
• One or more database tables
• … or CSV, or XML, or…
• Reference other content
• Reference image, audio, & other files
• Reference other systems
• for example, YouTube for video
![Page 9: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/9.jpg)
Transform to XML
• Represent fully built content and attributes
• Clean up errors
• Character encodings, HTML entities
• Filenames & paths
• Assign a unique identifier
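The cleanup steps listed above could look roughly like this in plain PHP. This is a minimal sketch, not code from the talk: the helper names (`cleanText`, `cleanFilename`, `makeGuid`), the assumed source encoding (Windows-1252), and the use of `md5()` to derive an identifier are all illustrative assumptions.

```php
<?php
// Hypothetical helpers; names and defaults are assumptions for illustration.

// Convert legacy text to UTF-8 and decode HTML entities such as &amp;.
function cleanText(string $text, string $sourceEncoding = 'Windows-1252'): string
{
    $utf8 = mb_convert_encoding($text, 'UTF-8', $sourceEncoding);
    return trim(html_entity_decode($utf8, ENT_QUOTES | ENT_HTML5, 'UTF-8'));
}

// Reduce a legacy path to a safe, lowercase file name.
function cleanFilename(string $path): string
{
    // Treat Windows-style separators like forward slashes, then keep the file name only.
    $name = basename(str_replace('\\', '/', $path));
    // Replace every run of characters outside a conservative safe set with a dash.
    $name = preg_replace('/[^a-z0-9._-]+/', '-', strtolower($name));
    return trim($name, '-');
}

// Derive a stable identifier from the source name and the legacy key,
// so re-running the export always yields the same ID for the same record.
function makeGuid(string $source, $legacyId): string
{
    return md5($source . ':' . $legacyId);
}
```

Because `makeGuid()` is deterministic, a later import can update existing content instead of duplicating it, which is what makes the process repeatable.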
![Page 10: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/10.jpg)
Import to Drupal
• Use Feeds & Feeds XPath Parser modules
• http://drupal.org/project/feeds_xpathparser
• also JSON Parser, and others
• UI to map XML to entity attributes & fields
• UI for importing, deleting content
![Page 11: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/11.jpg)
What else do we need?
• Access to source(s)
• Command line PHP to transform it to XML
• https://github.com/omerida/importtools
• … patience.
![Page 12: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/12.jpg)
Importing Author Profiles
• We need to import profiles for authors on our site.
• Authors are not users, just a biographical profile (content type)
• Data is provided via a CSV file
![Page 13: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/13.jpg)
Sample CSV data
    first name,last name,email,city,country,date_joined,company,bio,id,tags
    Whilemina,Benton,[email protected],Ansfelden,Sri Lanka,12/25/13,Vivamus Institute,"amet nulla. Donec non justo. Proin non massa non ante bibendum ullamcorper. Duis cursus, diam at pretium aliquet, metus urna convallis",1,"contributor, author, "
    Kim,Sellers,[email protected],Cabo de Santo Agostinho,Burundi,06/26/14,Maecenas Ornare Foundation,"eget massa. Suspendisse eleifend. Cras sed leo. Cras vehicula aliquet libero. Integer in magna. Phasellus dolor elit, pellentesque a, facilisis non, bibendum sed, est. Nunc laoreet lectus quis massa.",2,
![Page 14: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/14.jpg)
1. Parse CSV
    // Read in our sample CSV file and clean up each incoming row
    // with the profileParser() callback (see step 2).
    $csv = new readCsv(__DIR__ . '/sample.csv');
    $csv->setKey('id');
    $items = $csv->getArray('ProfileParser');
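For readers without the `importtools` classes at hand, the same idea (read rows, pass each through a cleanup callback, key the result by a column) can be sketched with PHP's built-in `fgetcsv()`. The function name `readCsvKeyed` and the header normalization (lowercase, spaces to underscores) are assumptions for illustration, not the library's actual behavior.

```php
<?php
// Hypothetical stand-in for a keyed CSV reader; not the importtools implementation.
function readCsvKeyed(string $file, string $key, ?callable $parser = null): array
{
    $fh = fopen($file, 'r');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $file");
    }

    $first = fgetcsv($fh, 0, ',', '"', '\\');
    if ($first === false) {
        fclose($fh);
        return []; // empty file
    }
    // Normalize header names: "first name" becomes "first_name".
    $header = array_map(
        fn($h) => str_replace(' ', '_', strtolower(trim((string) $h))),
        $first
    );

    $items = [];
    while (($row = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
        // Skip blank lines and rows whose column count does not match the header.
        if ($row === [null] || count($row) !== count($header)) {
            continue;
        }
        $item = array_combine($header, $row);
        if ($parser !== null) {
            $item = $parser($item);
            if ($item === false) {
                continue; // the callback rejected this row
            }
        }
        $items[$item[$key]] = $item;
    }
    fclose($fh);
    return $items;
}
```

A callback such as `profileParser` from step 2 could be passed as `$parser`; returning `false` from it drops the row.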
![Page 15: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/15.jpg)
2. Clean up incoming
    function profileParser($item) {
        // skip profiles without an email
        if (empty($item['email'])) {
            return false;
        }

        // create a "Last, First" display name
        $item['last_first'] = $item['last_name'] . ', ' . $item['first_name'];

        // clean up & split the tags column into an array;
        // trim before filtering so whitespace-only entries are dropped
        if (isset($item['tags'])) {
            $tags = explode(',', $item['tags']);
            $tags = array_map('trim', $tags);
            $tags = array_filter($tags);
            $item['tags'] = $tags;
        }

        // normalize date_joined (e.g. 12/25/13) to Y-m-d
        $date = new DateTime($item['date_joined']);
        $item['date_joined_clean'] = $date->format('Y-m-d');

        return $item;
    }
![Page 16: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/16.jpg)
3. Convert to XML
    // output XML
    $xml = new toXml("profiles", "profile");
    $xml->setHandler("tags", "tagHandler");
    $xml->convert($items);
    echo $xml->saveXML();
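The conversion step can be approximated with PHP's DOM extension. The following is a sketch under stated assumptions: `itemsToXml` and the handler convention (a callable that receives the document and the field value and returns an element) are hypothetical and may differ from how the `toXml` class and `tagHandler` actually work.

```php
<?php
// Hypothetical converter; the real toXml class may work differently.
function itemsToXml(array $items, string $rootName, string $itemName, array $handlers = []): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;
    $root = $doc->appendChild($doc->createElement($rootName));

    foreach ($items as $item) {
        $node = $root->appendChild($doc->createElement($itemName));
        foreach ($item as $field => $value) {
            if (isset($handlers[$field])) {
                // A handler builds and returns a custom element for this field.
                $node->appendChild($handlers[$field]($doc, $value));
                continue;
            }
            $child = $doc->createElement($field);
            // Text nodes are escaped on output, so & and < in the data are safe.
            $child->appendChild($doc->createTextNode((string) $value));
            $node->appendChild($child);
        }
    }
    return $doc->saveXML();
}

// Example handler: turn a tags array into <roles><role>...</role></roles>.
$tagHandler = function (DOMDocument $doc, array $tags): DOMElement {
    $roles = $doc->createElement('roles');
    foreach ($tags as $tag) {
        $role = $doc->createElement('role');
        $role->appendChild($doc->createTextNode($tag));
        $roles->appendChild($role);
    }
    return $roles;
};
```

Building the document through DOM calls rather than string concatenation means special characters in the source data cannot break the XML.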
![Page 17: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/17.jpg)
4. Save XML Output
    $ php csv2xml.php > profiles.xml

    <?xml version="1.0"?>
    <profiles>
      <profile>
        <first_name>Whilemina</first_name>
        <last_name>Benton</last_name>
        <email>[email protected]</email>
        <city>Ansfelden</city>
        <country>Sri Lanka</country>
        <date_joined>12/25/13</date_joined>
        <company>Vivamus Institute</company>
        <bio>amet nulla. Donec non justo. Proin non massa non ante bibendum ullamcorper. Duis cursus, diam at pretium aliquet, metus urna convallis</bio>
        <id>1</id>
        <roles>
          <role>contributor</role>
          <role>author</role>
        </roles>
        <last_first>Benton, Whilemina</last_first>
        <date_joined_clean>2013-12-25</date_joined_clean>
      </profile>
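Before uploading the file to Drupal, a quick check of the export can catch problems early. The sketch below is an assumption about what such a check might cover (well-formedness and unique IDs); `findExportProblems` is a hypothetical name, and the `profile` and `id` element names follow the sample output above.

```php
<?php
// Hypothetical pre-import check; not part of the talk's tooling.
function findExportProblems(string $xml): array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        return ['XML is not well-formed'];
    }

    $problems = [];
    $seen = [];
    foreach ($doc->profile as $profile) {
        $id = (string) $profile->id;
        if ($id === '') {
            $problems[] = 'profile without an id';
        } elseif (isset($seen[$id])) {
            $problems[] = "duplicate id $id";
        }
        $seen[$id] = true;
    }
    return $problems;
}
```

An empty array means the export passed; anything else is worth fixing in the transform script rather than in Drupal.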
![Page 18: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/18.jpg)
Our Profile Content Type
• Text fields: First Name, Last Name, City, Country, Company
• Email field: E-mail
• Date field: Date Joined
• Long Text: bio
• List: Tags (roles)
• Integer: Legacy ID
![Page 19: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/19.jpg)
![Page 20: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/20.jpg)
Step 4. Configure Feeds
• Basic: Disable periodic import
• Fetcher: File Upload
• Parser: XPath XML Parser
• Processor: Node processor
![Page 21: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/21.jpg)
5. Configure Processor
• Settings
• Bundle: Target content type
• Update: Update existing nodes
• Text format: HTML
• Expire Nodes: never
![Page 22: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/22.jpg)
6. Map Source to Target
![Page 23: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/23.jpg)
7. Map Inputs with XPath
![Page 24: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/24.jpg)
An XPath Primer
• Used to query XML documents
• A path can return multiple nodes
• /profiles/profile - give me all profile nodes
• Can test for attributes, elements, and more
• http://github.com/GeorgeMac/xpath-primer
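To get a feel for these queries outside Drupal, they can be tried with PHP's `DOMXPath`. This is a minimal sketch using a trimmed-down, made-up version of the profiles XML; the pattern of one query selecting each record and relative queries selecting its fields is how I'd expect the parser's mappings to behave, but treat that as an assumption.

```php
<?php
// Trimmed-down sample data for illustration only.
$xml = <<<XML
<profiles>
  <profile>
    <id>1</id>
    <first_name>Whilemina</first_name>
    <roles><role>contributor</role><role>author</role></roles>
  </profile>
  <profile>
    <id>2</id>
    <first_name>Kim</first_name>
    <roles/>
  </profile>
</profiles>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// An absolute path returns one node per record.
$profiles = $xpath->query('/profiles/profile');

$names = [];
$roles = [];
foreach ($profiles as $profile) {
    // Relative paths are evaluated against the current profile node.
    $id = $xpath->evaluate('string(id)', $profile);
    $names[$id] = $xpath->evaluate('string(first_name)', $profile);
    // A single path can return several nodes, such as every role.
    foreach ($xpath->query('roles/role', $profile) as $role) {
        $roles[$id][] = $role->nodeValue;
    }
}

// A predicate tests element content: only profiles with the "author" role.
$authors = $xpath->query('/profiles/profile[roles/role = "author"]');
```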
![Page 25: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/25.jpg)
8. Run the Import
![Page 26: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/26.jpg)
Advanced Feeds Tricks
• Provide a URL to an image, audio, or other media file, and it’ll be downloaded.
• Can create entity references
• As long as your GUIDs are set
• Can import to Field Collection
• https://drupal.org/project/field_collection_feeds
![Page 27: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/27.jpg)
hook_feeds_presave
• Clean up data on import
• Can also use regular node hooks
    function grad_importers_feeds_presave(FeedsSource $source, $entity, $item) {
      if ('publications' == $source->id) {
        // ensure yes/no fields are imported
        $entity->field_submitted_web[LANGUAGE_NONE][0]['value'] = (int) $item['xpathparser:5'];
        $entity->field_media_promotion[LANGUAGE_NONE][0]['value'] = (int) $item['xpathparser:6'];

        // don't lose freeform notes but import the values cleanly
        $notes = array_filter(array($item['xpathparser:8'], $item['xpathparser:12']));
        $entity->field_history_notes[LANGUAGE_NONE][0]['value'] = join("\n", $notes);
      }
    }
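The cleanup logic inside a presave hook is easier to test when it lives in small plain functions. The helpers below are hypothetical (`yesNoToInt` and `mergeNotes` are not from the talk), and they assume the source may contain strings such as "yes" or "no", whereas the hook above simply casts its values with `(int)`.

```php
<?php
// Hypothetical helpers; the accepted "yes" spellings are an assumption.

// Map a yes/no style value to 1 or 0 for a boolean field.
function yesNoToInt($value): int
{
    $normalized = strtolower(trim((string) $value));
    return in_array($normalized, ['1', 'y', 'yes', 'true'], true) ? 1 : 0;
}

// Join several freeform note columns, skipping the empty ones.
function mergeNotes(array $notes): string
{
    $notes = array_map(fn($note) => trim((string) $note), $notes);
    $notes = array_filter($notes, fn($note) => $note !== '');
    return implode("\n", $notes);
}
```

A hook could then call these helpers on the relevant `$item` values before assigning them to the entity's fields.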
![Page 28: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/28.jpg)
hook_feeds_presave II
• Extract data from a field, assign it to another
    function foo_feeds_presave(FeedsSource $source, $entity, $item) {
      if ('news_feed' == $source->id) {
        // get link and title out of the description field
        $doc = new DOMDocument();
        $x = @$doc->loadHTML($item['xpathparser:2']);

        $links = $doc->getElementsByTagName('a');
        if ($link = $links->item(0)) {
          $entity->field_link['und'][0]['title'] = $link->nodeValue;
          $entity->field_link['und'][0]['url']
            = $link->attributes->getNamedItem('href')->nodeValue;
        }
      }
    }
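The link extraction in the hook above can also be written as a standalone function, which makes it testable without a Drupal site. This is a sketch rather than the author's code: `extractFirstLink` is a hypothetical name, and unlike the hook it skips anchors that have no `href` and returns `null` when nothing usable is found.

```php
<?php
// Hypothetical helper mirroring the hook logic, without Drupal.
function extractFirstLink(string $html): ?array
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    foreach ($doc->getElementsByTagName('a') as $link) {
        if ($link->hasAttribute('href')) {
            return [
                'title' => trim($link->textContent),
                'url'   => $link->getAttribute('href'),
            ];
        }
    }
    return null;
}
```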
![Page 29: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/29.jpg)
Creating an Entity Reference
• Map a source to a Feeds GUID
• Set an XPath query to read the GUID
![Page 30: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/30.jpg)
Entity Reference XML
    <scholars>
      <scholar>
        <scholar_id>b36c6b08b402c12bd3a11657420cd5dd</scholar_id>
        <scholar_name>Sammy Zahran, PhD</scholar_name>
      </scholar>
    </scholars>
    <program_id>ee01cb8d56dbd67bf89c3c4fcf69e2f5</program_id>
    </publication>
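To see which GUIDs a record refers to, the same relative XPath queries can be tried in plain PHP. In this sketch the fragment above is wrapped in a complete document; the `<publications>` root and the `<title>` element are assumptions added only to make the example well-formed.

```php
<?php
// The wrapper elements are hypothetical; the IDs come from the fragment above.
$xml = <<<XML
<publications>
  <publication>
    <title>Example publication</title>
    <scholars>
      <scholar>
        <scholar_id>b36c6b08b402c12bd3a11657420cd5dd</scholar_id>
        <scholar_name>Sammy Zahran, PhD</scholar_name>
      </scholar>
    </scholars>
    <program_id>ee01cb8d56dbd67bf89c3c4fcf69e2f5</program_id>
  </publication>
</publications>
XML;

$doc = simplexml_load_string($xml);

$refs = [];
foreach ($doc->xpath('/publications/publication') as $publication) {
    $refs[] = [
        // A publication can reference several scholars, so this query may return many IDs.
        'scholars' => array_map('strval', $publication->xpath('scholars/scholar/scholar_id')),
        // A single-valued reference to a program.
        'program'  => (string) $publication->program_id,
    ];
}
```

Each extracted value only resolves to a reference if a scholar or program with that GUID has already been imported, so the referenced content is imported first.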
![Page 31: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/31.jpg)
What did we learn?
• Decouple your source from import
• Sources will change … versioning!
• Transform input sources to XML
• Clean up data with PHP before it gets to Drupal.
• Quicker & easier to run more than one import
![Page 32: Migrate without migranes](https://reader033.vdocument.in/reader033/viewer/2022042613/540d865b8d7f728d7e8b49ac/html5/thumbnails/32.jpg)
Thank You.
• @omerida on Twitter
• php[architect] - http://phparch.com
• Monthly magazine - write for us!
• Books, trainings, and more…
• php[world] - this November
• http://world.phparch.com
• Questions?