💡 Scrapy is a useful web-crawling framework in Python.
Scrapy can handle static websites. A static website has fixed content coded in HTML and is displayed in a browser exactly as it is stored.
To scrape dynamic websites, Scrapy can be combined with Splash through the scrapy-splash package. This article will show you how to set it up!
How to Install Scrapy Splash?
First off, let’s look at how to install and set up Splash.
There is a little more to this than just installing the Python package using pip.
To run Splash, a piece of software named Docker is needed.
🎓 Docker is an open-source containerization platform. It enables developers to package applications into containers, standardized executable components combining application source code with the operating system libraries and dependencies required to run the code in any environment.
Use this link to download Docker:
Once Docker is installed and the Docker app starts, execute the following command in a shell.
This will download the Splash Docker image.
docker pull scrapinghub/splash
After that, select Images in the Docker app.
scrapinghub/splash should now be available there, like in the image below. From here, press the Run button to the right of the image.
A window will then appear. Press “Optional settings” to expand it.
Fill in the name you want for the container. I simply used “splash” for mine.
The “Local host” field also needs to be filled in. It suggests 8050 by default, so I went with that. After these fields are filled in, press the Run button in the lower right corner of the window.
In your Docker app, navigate to Containers / Apps. The splash container should now appear, like this.
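The settings chosen in the window above (container name “splash”, local host port 8050) can also be expressed as a docker run command. The snippet below is a minimal sketch, not part of the original walkthrough: build_run_command and start_splash are hypothetical helper names, and it assumes that Splash listens on port 8050 inside the container and that the docker command is available on your PATH.

```python
import shlex
import subprocess
from typing import List


def build_run_command(name: str = "splash", host_port: int = 8050) -> List[str]:
    """Build a `docker run` command mirroring the settings from the Docker app.

    Hypothetical helper for illustration. Assumes Splash listens on port 8050
    inside the container.
    """
    return [
        "docker", "run",
        "-d",                       # run the container in the background
        "--name", name,             # container name, "splash" in this article
        "-p", f"{host_port}:8050",  # map the local host port to Splash's port
        "scrapinghub/splash",       # the image pulled earlier
    ]


def start_splash(name: str = "splash", host_port: int = 8050) -> None:
    """Start the container. Requires Docker to be installed and running."""
    subprocess.run(build_run_command(name, host_port), check=True)


# Print the command so it can be copied into a shell.
print(shlex.join(build_run_command()))
```

Calling start_splash() would start the container from Python. Printing the command first makes it easy to check before anything runs.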
To make sure everything is running as it should, either start a browser and go to
http://localhost:8050/, or press the button that says “Open in browser”, like in the image above. That will start your preferred browser and open the same address.
If everything is working, then this site should appear.
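The browser check can also be done from Python. This is a rough sketch using only the standard library. The function name splash_is_up is hypothetical, and it assumes the container was mapped to port 8050 on localhost as described above. It only confirms that something answers with HTTP 200 at that address, not that it is Splash.

```python
import urllib.error
import urllib.request


def splash_is_up(url: str = "http://localhost:8050/", timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at `url` with status 200.

    Hypothetical helper for illustration. The default URL assumes the port
    mapping used in this article.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Splash reachable:", splash_is_up())
```

If this prints False, check that the container is listed as running in the Docker app and that the port matches the one you entered.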
I will also include a link to Splash in the references that explains how to install Docker and set it up for use with Splash.
Now it’s time to install the scrapy-splash package using pip.
Run the following command in a shell, in your environment of choice, to download and install scrapy-splash.
pip install scrapy-splash
Once scrapy-splash has been successfully installed, everything should be good to go.
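To double-check the installation from inside Python, you can ask the interpreter whether it can find the package. This is a small sketch. The helper is_installed is a hypothetical name, and it assumes the import name of the package is scrapy_splash (with an underscore) and that you run it in the same environment where you ran pip.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment.

    Hypothetical helper for illustration. It looks the module up without
    importing it.
    """
    return importlib.util.find_spec(module_name) is not None


# Assumed import names: the scrapy-splash package is imported as scrapy_splash.
for name in ("scrapy", "scrapy_splash"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports missing, the most likely cause is that pip installed into a different environment than the one running the script.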
Where to Go From Here?
You can now dive into our tutorial on how to scrape dynamic websites using scrapy-splash here: