In this article, we will use data from the Web Gallery of Art, a virtual museum and searchable database of European fine arts spanning several centuries. The gallery can be accessed here.
We will build a model that predicts the name of the painter from an initial set of features of each painting. We will then gradually add more features, improve the feature engineering, and finally include the pictures themselves.
Through this article, we will illustrate:
- The importance of good feature engineering;
- The importance of data enrichment; and
- The impact both can have on accuracy.
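To make the last point concrete, here is a minimal sketch of how adding one feature can change accuracy. It is not the model built in the full article: the rows are invented for illustration and are not taken from the Web Gallery of Art catalogue, the feature names (`school`, `technique`) are assumptions, and the "classifier" is a simple lookup of the most frequent painter for each feature combination.

```python
from collections import Counter, defaultdict

# Invented toy rows, NOT real catalogue data.
# Each row is (school, technique, painter); these column names are assumptions.
TRAIN = [
    ("Italian", "fresco", "Giotto"),
    ("Italian", "fresco", "Giotto"),
    ("Italian", "oil", "Titian"),
    ("Italian", "oil", "Titian"),
    ("Italian", "oil", "Titian"),
    ("Dutch", "oil", "Rembrandt"),
    ("Dutch", "oil", "Rembrandt"),
    ("Dutch", "etching", "Rembrandt"),
]
TEST = [
    ("Italian", "fresco", "Giotto"),
    ("Italian", "oil", "Titian"),
    ("Dutch", "oil", "Rembrandt"),
    ("Dutch", "etching", "Rembrandt"),
]


def fit(rows, n_features):
    """Map each key (the first n_features values) to its most common painter."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[:n_features]][row[-1]] += 1
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    # Fallback for unseen keys: the most common painter overall.
    fallback = Counter(row[-1] for row in rows).most_common(1)[0][0]
    return table, fallback


def accuracy(model, rows, n_features):
    """Share of rows whose predicted painter matches the true painter."""
    table, fallback = model
    hits = sum(table.get(row[:n_features], fallback) == row[-1] for row in rows)
    return hits / len(rows)


acc_school = accuracy(fit(TRAIN, 1), TEST, 1)  # school only
acc_both = accuracy(fit(TRAIN, 2), TEST, 2)    # school + technique
print(f"school only:        {acc_school:.2f}")
print(f"school + technique: {acc_both:.2f}")
```

On these toy rows, the school alone cannot separate the two Italian painters, while adding the technique can. A real evaluation would need a proper held-out split and a far larger dataset.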
Ready? Let’s get started!
To download the data, you can either:
- click on this link to download the XLS file directly; or
- go to the Database tab on the website and click the last link: "You can download the catalogue for studying or searching off-line." Select the Excel format (5.2 MB).
I have published the full article on Explorium’s blog.