from IPython.display import HTML
HTML('''<script>
code_show=true;
function code_toggle() {
if (code_show){
$('div.input').hide();
} else {
$('div.input').show();
}
code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()">
<input type="submit" value="Click here to toggle on/off the raw code.">
</form>''')
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import re
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import dask.dataframe as dd
import dask.bag as db
import dask.array as da
# pd.set_option('display.max_colwidth', None)
# import warnings
# warnings.filterwarnings("ignore")
import warnings
from warnings import simplefilter
from sklearn.exceptions import ConvergenceWarning
from sklearn import tree
from IPython.display import Image
warnings.filterwarnings("ignore", category=Warning)
EC Corro | SS Garcia | J Gonzales | CR Patalud
"Quick Draw!" is an online game developed by Google Creative Lab that guesses the drawing as the player draws the given image. It was observed that majority of the sketches were guessed after the second stroke, but the game may need up to more than 10 strokes before being guessed correctly. It was also found that within the dataset, only 12.01% of the sketches were guessed upon completion of the first stroke. With this, this study aims to answer the question: Can Quick, Draw! sketches be identified given only the first stroke?.
The Quick, Draw! dataset used was obtained from Kaggle and consists of 53.1 GB of data comprising 36 million drawings across 82 categories.
Our methodology involves data preprocessing on the AWS platform using Dask library functions, exploratory data analysis, and model development. We developed two models: (1) without grouping and (2) with grouping, where categories with an F1 score below 0.3 are placed in a single group.
Our results show accuracies of 25.13% without grouping and 75.19% with grouping, both higher than the corresponding 1.25×PCC benchmarks of 2.11% and 68.13%. Compared with the Quick, Draw! AI, which guesses sketches correctly after the first stroke only 12.01% of the time, our models outperform it even without grouping, at 25.13%.
In conclusion, the model can be integrated into Quick, Draw!’s underlying neural network to improve its speed in guessing a sketch.
This study will be valuable for the future of sketch prediction. We imagine using a single sketch as an input that would prompt possible recommendations to the user. Additionally, with the increasing use of smartphones and touch devices, this will also be useful to educators and students by speeding up the sketching process, letting them focus more on the lesson at hand.
The Quick, Draw! game was invented by Jonas Jongejan, Henry Rowley, Takashi Kawashima, Jongmin Kim, and Nick Fox-Gieg of the Google Creative Lab$^1$. A player is given a maximum of 20 seconds to draw an object such as an apple, a zebra, a clock, or a paper clip. The game relies on data provided by previous players to learn more about the objects being drawn. One important feature of these sketches is the drawing stroke, defined as a continuous line made before lifting the pen or writing material. The data include the location and length of each stroke as well as the time it takes to draw it.
Quick, Draw! utilizes neural networks to quickly guess the object being drawn, and players are amazed at how fast images are guessed, often well before the 20 seconds are up and sometimes even before the sketch is finished. While the game shows impressive results, only 12.01% of the sketches in our dataset were guessed upon completion of the first stroke. The majority were guessed after the second stroke, but some images take more than 10 strokes before being guessed correctly.
In this study, we aim to answer the question: Can Quick, Draw! sketches be identified given only the first stroke? We explore how the algorithm can be improved to produce faster results, specifically by identifying sketches from the first drawing stroke alone.
The Quick, Draw! dataset was obtained from Kaggle$^2$ and was developed by the Google Creative Lab, a group of developers, researchers, and artists that administers 55 public repositories to explore, study, and learn from. The raw dataset is a collection of 50 million drawings across 345 categories contributed by players of the game. The drawings were captured as timestamped vectors and tagged with metadata, including the object the player was asked to draw and the player's country of origin.
The raw dataset is available in `csv` format, one file per category, and has the following columns:
Column Name | Data Type | Description |
---|---|---|
countrycode | String | Country code where the drawing originated |
drawing | Array | Array of the drawing's stroke coordinates and timings |
key_id | String | Unique key ID of the drawing |
recognized | Boolean | Whether the drawing was successfully recognized by the AI |
timestamp | Time Object | Timestamp when the object was drawn |
word | String | The object the participant was asked to draw |
From the original dataset of 227 GB of drawings, we preselected 53.1 GB of data under 82 categories, composed of 36 million sketches of organic living things: various animals, fruits, and other living things, as listed below. This subset was stored in the AWS S3 bucket `s3://bdccdoodle/drawings_csv`:
%%bash
aws s3 ls s3://bdccdoodle/drawings_csv/ --human-readable --summarize
Our methodology involves the following steps: data preprocessing, exploratory data analysis, and model development.
From the preselected Doodle dataset stored in the S3 bucket, we used Dask library functions to preprocess this large volume of data, working in the cloud through the Amazon Web Services (AWS) platform to overcome the memory limitations of our local computers.
In the preprocessing stage, since we aim to predict the sketch from the first stroke, we extracted only the coordinates of the first stroke of each sketch. We then separated its x and y coordinates and used these as features. We also computed and added `stroke_num` and `stroke_length` columns. All of the data preprocessing methods can be found in the attached `Data Preprocessing.ipynb` notebook.
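For illustration, here is a minimal sketch of that first-stroke extraction, assuming the raw `drawing` column stores one `[[x...], [y...], [t...]]` list per stroke and padding coordinates to a fixed width of 85 points (matching the column layout used in the plots below); the helper name and pad length are ours, not from the original preprocessing notebook.
import ast

def first_stroke_features(drawing_str, n_points=85):
    """Hypothetical helper: turn a raw drawing string into first-stroke features."""
    strokes = ast.literal_eval(drawing_str)      # one [[x...], [y...], [t...]] per stroke
    x, y = strokes[0][0], strokes[0][1]          # keep only the first stroke
    pad = lambda seq: (list(seq) + [0] * n_points)[:n_points]  # fixed-width columns
    features = {'stroke_num': len(strokes),      # total strokes in the sketch
                'stroke_length': len(x)}         # points in the first stroke
    features.update({f'x_{i}': v for i, v in enumerate(pad(x))})
    features.update({f'y_{i}': v for i, v in enumerate(pad(y))})
    return features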
The final dataset used in this study is described below:
Column Name | Data Type | Description |
---|---|---|
countrycode | String | Country code where the drawing originated |
recognized | Boolean | Whether the drawing was successfully recognized by the AI |
word | String | The object the participant was asked to draw |
stroke_num | Integer | Total number of strokes in the sketch |
stroke_length | Integer | Number of points in the extracted first stroke |
x_n | Integer | x coordinates of the first stroke |
y_n | Integer | y coordinates of the first stroke |
# Load the preprocessed first-stroke dataset lazily with Dask,
# then materialize it as a pandas dataframe
df2 = dd.read_csv('s3://bdcc-doodle/draw_processed_csv/draw_processed_*.csv')
df = df2.compute()
df
We determined how many drawings in the dataset were successfully recognized on the first stroke. It was found that only 12.01% of the recognized drawings were identified after the first stroke.
# Load the raw drawings from the public S3 bucket as a Dask dataframe
df1 = dd.read_csv('s3://bdccdoodle/drawings_csv/*.csv',
                  error_bad_lines=False,
                  storage_options={'anon': True},
                  dtype={'drawing': 'object'})
# Getting the number of strokes
def get_num_strokes(data):
    """Return the number of strokes in a drawing stored as a string."""
    data = data[1:-1]                      # strip the outer brackets
    x = data.split(']], ')                 # split into individual strokes
    x[-1] = x[-1][:-2]                     # drop the trailing ']]' of the last stroke
    x = [st + ']]' for st in x]            # restore each stroke's closing brackets
    return len(x)
# Count strokes per drawing (lazy Dask operation; `meta` declares the output dtype)
df1['stroke_num'] = df1.drawing.apply(get_num_strokes,
                                      meta=('stroke_num', 'int64'))
# Distribution of total stroke counts among recognized drawings
recog = df1[df1.recognized == True].stroke_num.value_counts().compute()
recog
recog.sum()
# Percentage of drawings that were correctly identified on the first stroke
percent_stroke1 = recog.loc[1] / recog.sum() * 100   # recog.loc[1]: the 1,326,364 single-stroke drawings
print("The percentage of drawings that were correctly identified "
      "on the first stroke is: %.2f%%" % round(percent_stroke1, 2))
Not all of the sketches were recognized by the Quick, Draw! AI. Thus, from our dataset, we compared the number of sketches that were and were not recognized by the Quick, Draw! AI.
df.recognized.value_counts().plot(kind='bar', figsize=(10,5), fontsize=20)
To get a picture of what the sketches look like, we used the `draw_strokes` function to visualize them. See the example `frog` sketches in Figure 4, which shows the first stroke of three different sketches of a frog.
def draw_strokes(data, item=None, count=20):
    """Plot the first strokes of up to `count` drawings, optionally for one category."""
    data = data.reset_index(drop=True)
    plt.figure(figsize=(30, 10))
    if item:
        idxs = data[data.word == item].index[:count]   # drawings of the chosen category
    else:
        idxs = data.index[:count]                      # first `count` drawings
    for idx in idxs:
        x = data.iloc[idx, 5:90].values       # first-stroke x coordinates
        y = data.iloc[idx, 90:175].values     # first-stroke y coordinates
        plt.plot(x, y)
    plt.title(item, fontsize=20)
    plt.show()
    return None
draw_strokes(df, item="frog", count=3)
Here we develop a classification model to predict the sketch based on the first stroke, using the `validate_clf` function to obtain the classification results. Our target variable was the `word` column, and the remaining columns were used as features. We used a 75-25 train-test split with a Random Forest Classifier, as it was the most efficient model to train on this dataset.
def validate_clf(data, model, df_eval):
    """Fit `model` on the first-stroke features and report evaluation results."""
    X_train, X_test, y_train, y_test = train_test_split(data.iloc[:, 5:],
                                                        data.word,
                                                        random_state=42,
                                                        test_size=0.25)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)            # predict once, reuse below
    df_eval.loc[len(df_eval)] = [type(model).__name__,
                                 model.get_params(),
                                 model.score(X_train, y_train),
                                 model.score(X_test, y_test)]
    display(df_eval)
    print()
    print(classification_report(y_true=y_test, y_pred=y_pred))
    return df_eval, classification_report(y_true=y_test, y_pred=y_pred,
                                          output_dict=True)
# Alternative draw_strokes for a dataframe layout with 150 x- and 150 y-coordinate
# columns starting at column 2 (shadows the definition above)
def draw_strokes(data):
    """Plot the first strokes of all drawings in `data`."""
    for idx in list(data.index):
        x = data.iloc[idx, 2:152].values      # first-stroke x coordinates
        y = data.iloc[idx, 152:302].values    # first-stroke y coordinates
        plt.plot(x, y)
    plt.show()
    return None
# Run this to reset the evaluation dataframe
cols = ['Classification Method',
'Parameters',
'Train Accuracy',
'Test Accuracy']
df_eval = pd.DataFrame(columns=cols)
To evaluate the model's performance, precision, recall, and F1 score were determined. After tuning our model's hyperparameters, we obtained a model accuracy of $25.13\%$. This is a relatively good accuracy given that we are classifying first-stroke sketches into $81$ classes.
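The tuning itself is not shown in this notebook; below is a hedged sketch of how such a search could be run. The grid values are illustrative assumptions, not the authors' actual search space; only max_depth=18 and n_estimators=100 are confirmed by the final model.
from sklearn.model_selection import GridSearchCV

# Hypothetical search grid around the final model's settings
param_grid = {'max_depth': [10, 14, 18, 22],
              'n_estimators': [50, 100, 200]}
search = GridSearchCV(RandomForestClassifier(random_state=42, n_jobs=-1),
                      param_grid, cv=3, scoring='accuracy')
search.fit(df.iloc[:, 5:], df.word)   # same features and target as validate_clf
print(search.best_params_, search.best_score_)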
mod1 = RandomForestClassifier(n_estimators=100,
criterion='gini',
max_depth=18,
max_features='auto',
n_jobs=-1,
random_state=42)
df_eval, _crdict = validate_clf(df, mod1, df_eval)
To benchmark how well our model performs, we computed the Proportional Chance Criterion (PCC). Our model's accuracy of 25.13% is higher than 1.25×PCC at 2.11%.
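For reference, with $p_i$ the proportion of samples in class $i$ out of $k$ classes, the quantity computed below is
$$PCC = \sum_{i=1}^{k} p_i^2, \qquad \text{benchmark} = 1.25 \times PCC.$$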
# ALL
pcc = sum(((np.array(df.word.value_counts().values)) /
df.word.value_counts().sum())**2)
print("[all items] PCC =", pcc)
print("[all items] 1.25*PCC =", 1.25*pcc)
Even though our model already outperforms the 1.25×PCC benchmark, we also explored how to further improve its performance by grouping together the categories that resulted in an F1 score lower than 0.3. This evaluation metric was chosen as the basis for grouping because, as the harmonic mean of precision and recall, it accounts for both. The choice of F1-score threshold is arbitrary. The intuition behind this step is that classes with a low F1 score contain first-stroke sketches that are difficult to identify; they are wrongly classified under other classes, which in turn depresses the F1 scores of those classes. Hence, placing classes with low F1 scores into a single placeholder may mitigate the issue of numerous wrongly classified sketches.
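Concretely, with precision $P$ and recall $R$, the metric used for the threshold is
$$F_1 = \frac{2PR}{P + R}.$$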
After identifying all the objects with an F1 score lower than 0.3, we tag them as `others` in the `word` column.
# Collect classes whose F1 score is below 0.3, skipping the report's summary rows
skip = {'accuracy', 'macro avg', 'weighted avg'}
others = [label for label, scores in _crdict.items()
          if label not in skip and scores['f1-score'] < 0.3]
print(len(others))
others
df2 = df.replace(others, 'others')
df2.head()
After tagging the objects under classes with low F1 scores as `others` in the previous step, the number of categories is now 18, including the `others` category. We then develop a model on this dataset, again using the `validate_clf` function, to measure the classifier's performance on less noisy data. With this, we arrived at an accuracy of $75.19\%$, significantly higher than the previous accuracy without grouping. The intuition behind the significant increase is that sketches that are hard to classify tend to fall under the `others` class; this means some classes have elements that are recognizable even from just the first stroke.
df_eval, _crdict = validate_clf(df2, mod1, df_eval)
We then computed the 1.25×PCC for the new set of classes and got $68.13\%$. Our model's accuracy of $75.19\%$ still outperforms this.
# with placeholder `others`
pcc2 = sum(((np.array(df2.word.value_counts().values)) /
df2.word.value_counts().sum())**2)
print("[with placeholder `others`] = ", pcc2)
print("[all items] 1.25*PCC2 =", 1.25*pcc2)
The summary of our results with and without grouping is shown in the table below. It shows that our model performs well: its test accuracy is higher than both the $1.25 \times PCC$ benchmark and the Quick, Draw! AI baseline.
Evaluation Metric | Without Grouping | With Grouping |
---|---|---|
Model Accuracy | $25.13\%$ | $75.19\%$ |
Quick, Draw! AI | $12.01\%$ | N/A |
$1.25 \times PCC$ | $2.11\%$ | $68.13\%$ |
Note that the accuracy of the Quick, Draw! AI is based on the ratio of the number of recognized sketches in one stroke to the total number of recognized sketches in the data set.
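In other words,
$$\text{accuracy}_{\text{Quick, Draw! AI}} = \frac{\#\ \text{sketches recognized on the first stroke}}{\#\ \text{sketches recognized}} = 12.01\%.$$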
This study can help in developing algorithms that will aid the future of sketch prediction. Similar to a Google search, where you type some text and Google offers suggestions, this study can pave the way toward using drawing strokes to suggest possible drawings or images in a search.
The pandemic has led to an increase in the use of touch devices and smartphones, so sketch recognition and completion will be beneficial, especially for online learning. They will help educators and students have a better learning experience when conveying their knowledge through sketches on their devices.
Based on the results, we were able to develop a model that predicts a sketch from its first stroke. With an accuracy of 25.13% without grouping and 75.19% with grouping, our model predicts sketches while outperforming the 12.01% performance of Quick, Draw!’s AI. With this, we believe the model can be integrated into Quick, Draw!’s underlying neural network to improve its speed in guessing a sketch.
For future studies, we recommend the following:
[1] Google Creative Lab. (2017). Quick, Draw! Retrieved from https://experiments.withgoogle.com/quick-draw
[2] Kaggle. Quick, Draw! Doodle Recognition Challenge. Retrieved from https://www.kaggle.com/c/quickdraw-doodle-recognition/discussion/73738