MLS Data with Barrett
TRANSCRIPT
![Page 1: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/1.jpg)
MLS Data
Barrett Avery
![Page 2: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/2.jpg)
What is Data Acquisition at IDX?
• Acquire data from over 600 MLSs across the US and Canada.
• Employ various methodologies.
• Sanitize and normalize the data.
• Store the data for use by our clients’ websites.
• Maintain a constant vigil on all MLS feeds to ensure they are running and updating properly.
![Page 3: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/3.jpg)
How we acquire MLS data at a high level
Download
• Via RETS, FTP, SFTP, SOAP, XML feed
Validate
• Sanitize the data to ensure integrity and readability.
Map/Store
• Map data to be human readable.
• Store in MySQL, NoSQL, Search Indexers.
Make Available
• Make data ready for display and search on client websites.
• Maintain data reliability and availability.
![Page 4: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/4.jpg)
Why should you care?
• Data is everything; it is our content. Without data, our websites would be nothing more than pretty templates.
• Data is what our customers are searching for; it is what powers the internet as we know it.
• At the end of the day, it’s the data that is our bread and butter.
![Page 5: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/5.jpg)
Some Stats
• Total listings across 589 MLSs: 3,826,086 in Platinum
  • With approx. 1,884,091 listings across 279 MLSs in Original
• Translates to 90 GB worth of data in Platinum
  • With approx. 20 GB worth of data in Original
• Stored across 7 AWS (cloud) RDS database stacks for Platinum
  • 8 physical database servers for Original
![Page 6: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/6.jpg)
Main Technologies used in Acquisition
• PHP
• MySQL
• AWS (Amazon Web Services)
• Laravel (PHP framework, Platinum only)
• NoSQL (Platinum only)
• Search Indexers (Platinum only)
• Node.js (Platinum only)
![Page 7: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/7.jpg)
How do we get all that data?
• RETS v1.5 through v1.8
• FTP
• SFTP
• SOAP
• XML Feeds
![Page 8: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/8.jpg)
Real Estate Transaction Standard (RETS)
• Custom built using the phRETS PHP library.
• Using the phRETS library, we parse the XML responses and store them in CSV format for later parsing and eventual storage in the database.
• Compatible with RETS versions 1.5 through 1.8.
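
The XML-to-CSV step above can be sketched roughly as follows. This is an illustration in Python, not the PHP/phRETS code the slide refers to. It assumes the server answers in the RETS COMPACT layout (a `DELIMITER` element holding a hex character code, plus delimiter-wrapped `COLUMNS` and `DATA` rows); the sample response and the `compact_to_csv` name are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample of a RETS COMPACT search response (tab-delimited).
SAMPLE_RESPONSE = (
    '<RETS ReplyCode="0" ReplyText="Success">'
    '<DELIMITER value="09"/>'
    "<COLUMNS>\tListingID\tListPrice\tCity\t</COLUMNS>"
    "<DATA>\t1001\t250000\tEugene\t</DATA>"
    "<DATA>\t1002\t315000\tSalem\t</DATA>"
    "</RETS>"
)


def compact_to_csv(response_xml):
    """Convert a COMPACT-style response body into CSV text."""
    root = ET.fromstring(response_xml)
    if root.get("ReplyCode") != "0":
        raise ValueError("RETS error: %s" % root.get("ReplyText"))

    # The DELIMITER value is the hex code of the separator character.
    delim_el = root.find("DELIMITER")
    delimiter = chr(int(delim_el.get("value"), 16)) if delim_el is not None else "\t"

    def split_row(text):
        # Rows start and end with the delimiter, so drop the empty edge fields.
        return (text or "").split(delimiter)[1:-1]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(split_row(root.findtext("COLUMNS")))
    for data in root.iter("DATA"):
        writer.writerow(split_row(data.text))
    return out.getvalue()


csv_text = compact_to_csv(SAMPLE_RESPONSE)
```

Writing through `csv.writer` rather than joining strings by hand means values containing commas or quotes are escaped correctly in the intermediate file.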
![Page 9: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/9.jpg)
FTP/SFTP
• Using custom FTP methods built in PHP
• Downloading and parsing file formats such as:
  • Text (TXT)
  • Comma-separated values (CSV)
  • Tab or other control-character delimited files (TXT)
  • Just about anything else that PHP can read
• Parse out raw files to be stored in database
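
As a rough illustration of parsing a downloaded delimited file, here is a minimal Python sketch (the production code described above is PHP). The `parse_delimited` helper, the pipe delimiter, and the sample rows are assumptions made for the example; a real feed would define its own delimiter and columns.

```python
import csv
import io


def parse_delimited(text, delimiter="\t"):
    """Parse delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for row in reader:
        # Trim stray whitespace; short rows yield None values, which become "".
        # Extra unnamed fields (key None) are ignored in this sketch.
        rows.append(
            {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        )
    return rows


# Hypothetical pipe-delimited export from an MLS FTP drop.
sample = "ListingID|City|Beds\n1001| Eugene |3\n1002|Salem|\n"
listings = parse_delimited(sample, delimiter="|")
```

Passing the delimiter as a parameter keeps one parser usable for comma, tab, pipe, or other control-character separated files.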
![Page 10: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/10.jpg)
SOAP
• Using a highly customized SOAP script written in PHP
• Parsing XML data returned by SOAP requests into CSV format for later parsing and storage
• Currently we have only one of these: NWMLS
![Page 11: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/11.jpg)
XML
• Using a custom XML parser written in PHP
• Parsing out the XML into CSV format for later parsing and storage by the application
• These are usually one-off boards with their own set of rules and layouts
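
Because each one-off board has its own layout, a flattening step like the one above has to cope with records that carry different fields. The following Python sketch shows one way to do that; it is not the custom PHP parser from the slide. The `<Listings>`/`<Listing>` element names, the sample feed, and `feed_to_csv` are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical feed; real boards use their own element names and nesting.
SAMPLE_FEED = """
<Listings>
  <Listing>
    <ListingID>2001</ListingID>
    <Price>189000</Price>
  </Listing>
  <Listing>
    <ListingID>2002</ListingID>
    <Price>420000</Price>
    <City>Bend</City>
  </Listing>
</Listings>
"""


def feed_to_csv(feed_xml, record_tag="Listing"):
    """Flatten each record element's direct children into one CSV row."""
    root = ET.fromstring(feed_xml.strip())
    records = [
        {child.tag: (child.text or "").strip() for child in rec}
        for rec in root.iter(record_tag)
    ]

    # Build the header from every field seen, in first-seen order.
    fieldnames = []
    for rec in records:
        for name in rec:
            if name not in fieldnames:
                fieldnames.append(name)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


feed_csv = feed_to_csv(SAMPLE_FEED)
```

Records that lack a field get an empty cell, so every row in the CSV has the same columns regardless of what an individual listing supplied.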
![Page 12: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/12.jpg)
Normalizing Data
• Sanitizing common things such as Booleans, dates, and numbers.
• Associating codes with their respective long names (where applicable).
• Accommodating non-standard formatting of data.
• Adhering to MLS display rules.
• Mapping fields in the data to more human-readable fields.
• Using robust database tools such as MySQL, NoSQL, and Search Indexers to ensure fast and secure storage of the data.
• Making the data displayable and searchable on client websites.
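
The sanitizing and code-mapping steps above might look something like this minimal Python sketch. Everything specific here is an assumption for illustration: the accepted Boolean spellings, the date formats, the `PROPERTY_TYPE_NAMES` lookup, and the field names other than ListingID are hypothetical, and each MLS would need its own rules.

```python
from datetime import datetime

TRUE_VALUES = {"y", "yes", "true", "t", "1"}
FALSE_VALUES = {"n", "no", "false", "f", "0"}
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y-%m-%dT%H:%M:%S")

# Hypothetical code-to-long-name lookup for one MLS.
PROPERTY_TYPE_NAMES = {"RES": "Residential", "CND": "Condominium", "LND": "Land"}


def to_bool(value):
    """Map common yes/no spellings to True/False; unknown values become None."""
    v = value.strip().lower()
    if v in TRUE_VALUES:
        return True
    if v in FALSE_VALUES:
        return False
    return None


def to_number(value):
    """Strip currency symbols and thousands separators, then parse."""
    cleaned = value.replace("$", "").replace(",", "").strip()
    try:
        number = float(cleaned)
    except ValueError:
        return None
    return int(number) if number.is_integer() else number


def to_iso_date(value):
    """Try each known format and return an ISO date string, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize(raw):
    """Normalize one raw listing row (all values arrive as strings)."""
    code = raw.get("PropertyType", "").strip().upper()
    return {
        "ListingID": raw.get("ListingID", "").strip(),
        "ListPrice": to_number(raw.get("ListPrice", "")),
        "HasPool": to_bool(raw.get("HasPool", "")),
        "ListDate": to_iso_date(raw.get("ListDate", "")),
        # Fall back to the raw code when no long name is known.
        "PropertyType": PROPERTY_TYPE_NAMES.get(code, code),
    }


clean = normalize({
    "ListingID": " 1001 ",
    "ListPrice": "$250,000",
    "HasPool": "Y",
    "ListDate": "03/15/2016",
    "PropertyType": "res",
})
```

Returning `None` for values that cannot be parsed, instead of guessing, keeps bad source data visible to whoever monitors the feed.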
![Page 13: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/13.jpg)
Images
• RETS
  • Downloading from server.
  • Downloading Object URLs.
  • Downloading Media Objects from RETS Resource.
  • Setting URLs for images based on the MLS-provided spec.
• FTP/SFTP
  • Download directly from server.
  • Download a list of URLs to reference.
  • Download a list of filenames to download.
• SOAP
  • Download directly based on date updated.
• XML
  • Various means, mostly directly from server.
![Page 14: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/14.jpg)
Geocoding (GIS)
• Currently using MapQuest
• We store over 24 million valid geocodes
• Updates every day
• Approximately 44,000 new geocodes per day
![Page 15: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/15.jpg)
Putting it all together
• Download listing data
• Download agent/office data
• Download media (images, virtual tours, open houses)
• Associate all components by their unique IDs
  • ListingID, InternalID, AgentID, OfficeID, MediaID (Open Houses and Virtual Tours)
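
The association step above amounts to joining separately downloaded record sets on their IDs. Here is a minimal Python sketch under stated assumptions: the ID field names come from the slide, but which record carries which ID (for example, that each media item holds a ListingID) is a guess made for the example, and `assemble_listings` and the sample records are hypothetical.

```python
from collections import defaultdict


def assemble_listings(listings, agents, offices, media):
    """Attach agent, office, and media records to each listing by ID."""
    agents_by_id = {a["AgentID"]: a for a in agents}
    offices_by_id = {o["OfficeID"]: o for o in offices}

    # One listing can have many media items, so group them into lists.
    media_by_listing = defaultdict(list)
    for item in media:
        media_by_listing[item["ListingID"]].append(item)

    assembled = []
    for listing in listings:
        record = dict(listing)
        # A missing agent or office yields None rather than an error.
        record["Agent"] = agents_by_id.get(listing.get("AgentID"))
        record["Office"] = offices_by_id.get(listing.get("OfficeID"))
        record["Media"] = media_by_listing.get(listing["ListingID"], [])
        assembled.append(record)
    return assembled


# Hypothetical sample records.
listings = [{"ListingID": "1001", "AgentID": "A1", "OfficeID": "O1"}]
agents = [{"AgentID": "A1", "Name": "Pat Example"}]
offices = [{"OfficeID": "O1", "Name": "Example Realty"}]
media = [
    {"MediaID": "M1", "ListingID": "1001", "Type": "image"},
    {"MediaID": "M2", "ListingID": "1001", "Type": "virtual_tour"},
]
combined = assemble_listings(listings, agents, offices, media)
```

Indexing agents, offices, and media by ID first means each listing is assembled with dictionary lookups instead of repeated scans over the full record sets.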
![Page 16: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/16.jpg)
Search the data
• Robust
• Configurable
• Fast
• Accurate
• Across multiple devices
![Page 17: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/17.jpg)
Future plans
• Full object-oriented architecture
• Code to adhere to PSR standards
• Full integration with NoSQL, Search Indexers and MySQL
  • This will provide much quicker searches as well as be more scalable
• Multi-Day updates for all RETS MLSs
• Sold data for MLSs that support and provide it
![Page 18: MLS Data with Barrett](https://reader031.vdocument.in/reader031/viewer/2022030300/588066b21a28ab0b098b6907/html5/thumbnails/18.jpg)