Need Python spider(s) written and run to create XML file(s) that meet these requirements:
Using a list of ~29k zip codes, which I will provide as a .csv, I need the location data for all Champs Sports stores in the USA. I recommend using their store locator:
[login to view URL]
Feed the zip codes into this store locator and scrape the resulting search results pages. Most of the information can be scraped from the initial search results, but some fields, such as store hours and latitude/longitude, must be extracted from external sources such as the Google Maps API.
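As a rough sketch of the first step, the zip codes could be loaded from the .csv like this. The file layout is not known yet, so the assumption here is one zip code per row in the first column, possibly with a header row. The function name `load_zip_codes` is just illustrative.

```python
import csv
import io


def load_zip_codes(csv_file, column=0):
    """Yield 5-digit zip code strings from an open CSV file object.

    Assumes (not confirmed) that zip codes sit in the given column.
    """
    for row in csv.reader(csv_file):
        if len(row) <= column:
            continue  # blank line or short row
        value = row[column].strip()
        if not value.isdigit():
            continue  # skips a header row or an empty cell
        # Spreadsheet exports often drop leading zeros (00725 becomes 725),
        # so pad back to five digits.
        yield value.zfill(5)


# Small in-memory example standing in for the real ~29k-row file.
sample = io.StringIO("zip\n725\n10001\n\n02134\n")
zip_codes = list(load_zip_codes(sample))
```

Keeping the codes as strings matters because zip codes such as 00725 lose their leading zeros if parsed as integers.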
This process will likely return duplicate results, which must be removed.
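One possible way to handle the duplicates, sketched under the assumption that each scraped store is a dict keyed by the XML field names: use storeId as the identity when present, and fall back to address plus postal code when it is missing. The fallback rule is my assumption, not something the locator guarantees.

```python
def dedupe_stores(stores):
    """Return stores with duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for store in stores:
        # Prefer storeId; otherwise fall back to a normalized
        # (address, postalCode) pair. This fallback is an assumption.
        key = store.get("storeId") or (
            store.get("address", "").strip().lower(),
            store.get("postalCode", "").strip(),
        )
        if key in seen:
            continue
        seen.add(key)
        unique.append(store)
    return unique


# Nearby zip codes typically return the same store more than once.
scraped = [
    {"storeId": "1504", "address": "Las Catalinas Mall", "postalCode": "00725"},
    {"storeId": "1504", "address": "Las Catalinas Mall", "postalCode": "00725"},
    {"storeId": "", "address": "1 Example Plaza", "postalCode": "10001"},
    {"storeId": "", "address": "1 example plaza ", "postalCode": "10001"},
]
unique = dedupe_stores(scraped)
```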
Sample XML format:
<stores>
<store>
<merchantId></merchantId>
<storeId>1504</storeId>
<address>Las Catalinas Mall</address>
<city>Caguas</city>
<region>PR</region>
<postalCode>00725</postalCode>
<lat>18.205647</lat>
<lng>-66.034454</lng>
<phone>787-258-0110</phone>
<hours>10:00am - 9:00pm Monday - Friday, 11:00am - 6:00pm Saturday, 11:00am - 6:00pm Sunday</hours>
</store>
<store>
...
</store>
</stores>
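For reference, the sample format above could be produced with the standard library's `xml.etree.ElementTree`, roughly as follows. This is only a sketch: it assumes each store is a dict keyed by the element names in the sample, and the helper names `build_stores_xml` and `to_xml_string` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Element order taken from the sample XML.
FIELDS = [
    "merchantId", "storeId", "address", "city", "region",
    "postalCode", "lat", "lng", "phone", "hours",
]


def build_stores_xml(stores):
    """Build a <stores> element with one <store> child per dict."""
    root = ET.Element("stores")
    for store in stores:
        node = ET.SubElement(root, "store")
        for field in FIELDS:
            child = ET.SubElement(node, field)
            # Missing values become empty elements, like <merchantId> above.
            child.text = str(store.get(field, ""))
    return root


def to_xml_string(root):
    """Serialize, writing empty fields as <tag></tag> rather than <tag />."""
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


example = {
    "storeId": "1504", "address": "Las Catalinas Mall", "city": "Caguas",
    "region": "PR", "postalCode": "00725", "lat": "18.205647",
    "lng": "-66.034454", "phone": "787-258-0110",
}
xml_text = to_xml_string(build_stores_xml([example]))
```

ElementTree escapes characters such as `&` in addresses automatically, which is safer than assembling the XML by string concatenation.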
In addition to the XML file we need now, we'd like to be able to run this spider for additional queries/pages. Please package it up for us so that we can run it in the future.
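To make the re-run requirement concrete, one option is a small command-line entry point along these lines. The argument names (`zip_csv`, `--output`, `--merchant-id`) are placeholders I am assuming for illustration; the actual interface is up to whoever builds it.

```python
import argparse


def build_parser():
    """Command-line options for re-running the spider (names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Scrape store locations and write them to an XML file."
    )
    parser.add_argument("zip_csv", help="path to the .csv of zip codes")
    parser.add_argument("--output", default="stores.xml",
                        help="path of the XML file to write")
    parser.add_argument("--merchant-id", default="",
                        help="value for the <merchantId> element")
    return parser


# Example invocation with an explicit argument list; "EXAMPLE" is a dummy value.
args = build_parser().parse_args(["zips.csv", "--merchant-id", "EXAMPLE"])
```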
## Deliverables
I will provide the merchantId shortly.