How to ingest premium market data with Zipline Reloaded

This article explains how to build the two Python scripts you need to create a custom data bundle from premium data using Zipline Reloaded.
Step 1: Subscribe to premium data
By now you should already have an account with Nasdaq Data Link. If not, head over to https://data.nasdaq.com and set one up.
You're looking for QuoteMedia End of Day US Stock Prices. This product offers end-of-day prices, dividends, adjustments and splits for US publicly traded stocks with history to 1996. Prices are provided both adjusted and unadjusted. The product covers all stocks with primary listing on NASDAQ, AMEX, NYSE, and ARCA.
You can find the page to subscribe here: https://data.nasdaq.com/databases/NAEOD
Once subscribed, you can access the data with your API key.
Step 2: Create/Edit extension.py
Now we'll create the two files needed to build the bundle.
For Windows users
In the .zipline directory, you will store the extension.py file, which informs Zipline about the custom data bundle.
- Open the File Explorer and navigate to your home directory. You should find the .zipline folder there. If you're not sure where your home directory is, it's usually C:\Users\[YourUsername].
- Open the .zipline folder.
- Right-click within the folder, select New, then choose Text Document. Rename the newly created file to extension.py. Make sure you change the file extension from .txt to .py.
Note: If you can't see file extensions in your File Explorer, you'll need to enable them. To do this, click on the View tab in File Explorer, and then check the box for File name extensions.
For Mac/Linux/Unix users
- Open Terminal: You can do this by searching for "Terminal" using Spotlight (Cmd + Space) on Mac or by accessing it from the Applications folder.
- Navigate to .zipline Directory: By default, the terminal opens in your home directory. To ensure you're in the home directory and then navigate to the .zipline directory, you can use the following commands:
  cd ~
  cd .zipline
- Create/Edit the extension.py File:
  - If the file doesn't exist: You can create it using the touch command followed by opening it with a text editor of your choice.
    touch extension.py
  - If the file already exists: Simply open it with a text editor.
For all users
Within the editor, you can now proceed to input or edit the necessary content. In the file, add the following content:
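As a minimal sketch, an extension.py that registers the bundle might look like the following. The bundle name `quotemedia` and the `XNYS` calendar are assumptions made for illustration, and the `sys.path` line assumes that daily_us_equities.py (created in Step 3) sits in the same .zipline directory and exposes the `daily_us_equities_bundle` function.

```python
import sys
from pathlib import Path

# Assumption: daily_us_equities.py lives in ~/.zipline next to this file,
# so that directory is added to the import path.
sys.path.append(str(Path.home() / ".zipline"))

from zipline.data.bundles import register

from daily_us_equities import daily_us_equities_bundle

# "quotemedia" is an illustrative bundle name; "XNYS" is the NYSE trading calendar.
register("quotemedia", daily_us_equities_bundle, calendar_name="XNYS")
```

Whatever name is passed to `register` is the name you would later use to refer to the bundle.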
Save and close the file.
Step 3: Create the code to build the bundle
Use the instructions above to create a file called daily_us_equities.py.
In the file, add the following code exactly as is (do not alter!):
The format_metadata_url function constructs the URL for querying Nasdaq Data Link based on a provided API key and selects specific columns of data to retrieve, including ticker information, date, and price metrics.
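To make the URL construction concrete, here is a rough sketch of how such a function could be written with the standard library. The endpoint path, the `qopts.columns` and `qopts.export` parameters, and the column names are assumptions based on the general shape of the Nasdaq Data Link tables API, not a copy of the original listing; check them against your subscription.

```python
from urllib.parse import urlencode

# Assumed endpoint and column names; verify them against your subscription.
BASE_URL = "https://data.nasdaq.com/api/v3/datatables/QUOTEMEDIA/PRICES.json"
COLUMNS = ["ticker", "date", "open", "high", "low", "close", "volume", "dividend", "split"]


def format_metadata_url(api_key: str) -> str:
    """Build the query URL that requests a bulk export of the selected columns."""
    query = [
        ("api_key", api_key),
        ("qopts.columns", ",".join(COLUMNS)),
        ("qopts.export", "true"),
    ]
    return f"{BASE_URL}?{urlencode(query)}"


print(format_metadata_url("YOUR_API_KEY"))
```

Using `urlencode` rather than string concatenation keeps the API key and the comma-separated column list safely escaped.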
The fetch_download_link function attempts to retrieve the actual data download link from Nasdaq Data Link. This link is dynamic and can change, so the function repeatedly checks the status of the data until it is ready for download. If the data isn't ready, the function waits for a set interval before trying again, up to a maximum number of tries (defined by MAX_DOWNLOAD_TRIES).
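The polling pattern described above can be sketched as follows. This is an illustration under stated assumptions rather than the original function: the payload shape (`datatable_bulk_download` / `file` / `status` / `link`), the `"fresh"` status value, the constants' values, and the injected `fetch_payload` and `sleep` callables are all hypothetical choices made so the example runs without network access.

```python
import time
from typing import Callable

MAX_DOWNLOAD_TRIES = 5  # illustrative value
WAIT_SECONDS = 10       # illustrative value


def fetch_download_link(
    fetch_payload: Callable[[], dict],
    max_tries: int = MAX_DOWNLOAD_TRIES,
    wait_seconds: float = WAIT_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the export is ready, then return its download link."""
    for attempt in range(1, max_tries + 1):
        file_info = fetch_payload()["datatable_bulk_download"]["file"]
        if file_info["status"] == "fresh":
            return file_info["link"]
        if attempt < max_tries:
            sleep(wait_seconds)  # give the server time to build the file
    raise RuntimeError(f"Export not ready after {max_tries} tries")


# Simulated responses: not ready on the first call, ready on the second.
responses = iter([
    {"datatable_bulk_download": {"file": {"status": "creating", "link": None}}},
    {"datatable_bulk_download": {"file": {"status": "fresh", "link": "https://example.com/data.zip"}}},
])
link = fetch_download_link(lambda: next(responses), sleep=lambda seconds: None)
print(link)
```

In real use, `fetch_payload` would perform the HTTP request against the metadata URL and return the decoded JSON.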
load_data_table extracts and processes data from a downloaded ZIP file. It assumes the ZIP file contains a single CSV file, from which data is read into a Pandas DataFrame. The columns are renamed to be compatible with Zipline's naming conventions.
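The single-CSV assumption and the column renaming can be illustrated with a small dependency-free sketch. Note the swap: the function described above reads into a Pandas DataFrame, whereas this version uses the standard library `csv` module and returns a list of dictionaries. The rename mapping is an assumption for illustration.

```python
import csv
import io
import zipfile

# Assumed mapping from source column names to Zipline-style names.
RENAMES = {"ticker": "symbol", "dividend": "ex_dividend", "split": "split_ratio"}


def load_data_table(zip_bytes: io.BytesIO) -> list:
    """Read the single CSV inside a ZIP archive and rename its columns."""
    with zipfile.ZipFile(zip_bytes) as archive:
        names = archive.namelist()
        if len(names) != 1:
            raise ValueError(f"Expected one file in the archive, found {len(names)}")
        with archive.open(names[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return [
                {RENAMES.get(col, col): value for col, value in row.items()}
                for row in csv.DictReader(text)
            ]


# Build a small in-memory archive to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "prices.csv",
        "ticker,date,close,dividend,split\nAAPL,2020-01-02,75.09,0.0,1.0\n",
    )
buffer.seek(0)

rows = load_data_table(buffer)
print(rows[0]["symbol"], rows[0]["close"])
```

Failing loudly when the archive holds more than one file makes the single-CSV assumption explicit instead of silently reading the wrong member.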
The fetch_data_table function fetches the data table by constructing the appropriate metadata URL and then downloading the data, using the functions described above.
Subsequent functions like gen_asset_metadata, parse_splits, parse_dividends, and parse_pricing_and_vol provide parsing and transformation capabilities to process the raw data into a format suitable for Zipline. They generate asset metadata, handle stock split and dividend data, and parse pricing and volume data, respectively.
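As one example of these transformations, generating asset metadata amounts to summarizing each symbol's date range. The sketch below uses plain Python rather than Pandas, and the field names, the `exchange` placeholder, and the rule that an asset auto-closes one day after its last bar are assumptions for illustration.

```python
from datetime import date, timedelta


def gen_asset_metadata(rows, exchange="EXAMPLE"):
    """Summarize each symbol's first and last trading date."""
    spans = {}
    for row in rows:
        day = date.fromisoformat(row["date"])
        first, last = spans.get(row["symbol"], (day, day))
        spans[row["symbol"]] = (min(first, day), max(last, day))

    return [
        {
            "symbol": symbol,
            "start_date": first,
            "end_date": last,
            "exchange": exchange,
            # Assumption: the asset auto-closes the day after its last bar.
            "auto_close_date": last + timedelta(days=1),
        }
        for symbol, (first, last) in sorted(spans.items())
    ]


sample_rows = [
    {"symbol": "MSFT", "date": "2020-01-03"},
    {"symbol": "AAPL", "date": "2020-01-02"},
    {"symbol": "AAPL", "date": "2020-01-06"},
]
for record in gen_asset_metadata(sample_rows):
    print(record["symbol"], record["start_date"], record["end_date"])
```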
The core function, daily_us_equities_bundle, integrates all of this functionality to fetch and prepare the QuoteMedia End of Day US Stock Prices dataset for Zipline's consumption. It checks for the required API key, fetches the raw data table, processes it, and writes the formatted data to disk. This function is the primary interface that a user or system calls to get Nasdaq Data Link (formerly Quandl) data into Zipline's bundle format.
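The API key check at the start of the bundle function can be sketched like this. Zipline passes an environment mapping to bundle functions, so the helper takes one as an argument. The helper name `get_api_key` and the variable name `DATALINK_API_KEY` are hypothetical; use whatever name your bundle code actually reads.

```python
import os


def get_api_key(environ=os.environ, name="DATALINK_API_KEY"):
    """Return the API key from the environment or fail with a clear message."""
    api_key = environ.get(name)
    if not api_key:
        raise ValueError(f"Set the {name} environment variable to your API key")
    return api_key


# A plain dict stands in for the real environment here.
print(get_api_key({"DATALINK_API_KEY": "YOUR_API_KEY"}))
```

Failing early with a descriptive error avoids a less obvious authentication failure later in the download.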
Lastly, the download_with_progress function handles the actual data download. It shows a visual progress bar for tracking the download and returns the downloaded data as a BytesIO object, making it easier to subsequently process or store the data.
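The chunked-download idea can be sketched without a network connection. This is a simplified stand-in, not the original function: it reads from any file-like object instead of a URL, and a plain `report` callback replaces the progress bar. The parameter names are assumptions.

```python
import io


def download_with_progress(stream, total_size, chunk_size=8192, report=print):
    """Copy a file-like stream into a BytesIO, reporting progress per chunk."""
    data = io.BytesIO()
    received = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data.write(chunk)
        received += len(chunk)
        report(f"{received}/{total_size} bytes")
    data.seek(0)  # rewind so callers can read from the start
    return data


# An in-memory stream stands in for the HTTP response body.
source = io.BytesIO(b"x" * 20000)
result = download_with_progress(source, total_size=20000)
print(len(result.getvalue()))
```

Returning a rewound BytesIO means the result can be handed straight to code that expects a file, such as a ZIP reader.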