You can play it here: One tøp song
1000. That's how many words appear in only one twenty øne piløts song, give or take (see next section for more details). They range from obvious ones like "stressed" to words that you swear you heard elsewhere but didn't, such as "went".
In this game you will be shown one word from the list and asked which song it is from. (Feel free not to answer 🙂)
I downloaded lyrics to all tøp songs from azlyrics.com with Python (which, in retrospect, looks like a violation of their terms of service), saved them in .txt files, and wrote another Python script to locate words that appear in only one song.
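The "appears in only one song" step can be sketched roughly as follows. This is a minimal illustration, not the actual script: the function name `words_in_one_song`, the tokenizing regex, and the assumption that lyrics are already loaded into a dict of title to text are all mine.

```python
import re
from collections import defaultdict


def words_in_one_song(lyrics):
    """Map each word that occurs in exactly one song to that song's title.

    `lyrics` is assumed to be a dict of {song title: full lyric text};
    loading the .txt files into that dict is left out of this sketch.
    """
    songs_by_word = defaultdict(set)
    for title, text in lyrics.items():
        # Assumed tokenization: lowercase runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", text.lower()):
            word = token.strip("'")
            if word:
                songs_by_word[word].add(title)
    return {
        word: next(iter(songs))
        for word, songs in songs_by_word.items()
        if len(songs) == 1
    }


# Placeholder text, not real lyrics.
toy = {"Song A": "alpha beta", "Song B": "Beta gamma"}
unique = words_in_one_song(toy)
```

With the placeholder input, "beta" is dropped because it occurs in both songs, while "alpha" and "gamma" each map to their single song.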
If inflections of the same word are found in any other song, all of them, including the original form, are removed (with a third Python script of course), unless they are semantically or phonetically distinguishable. Inflections formed with -s, -es, -d, -ed or -ing count as phonetically indistinguishable. However, if all forms appear in the same song only, they are all preserved.
For example, "sons" only appears in Addict With A Pen ("I haven't been the best of sons"), but because "son" appears in Taxi Cab, Clear, Polarize and No Chances, "sons" is removed from the dataset. "Weathered" (Chlorine) is semantically distinguishable from "weather" (Good Day, Migraine), so it is preserved. "Buy" (House of Gold) is phonetically distinguishable from "bought" (Car Radio, Levitate), so it is also preserved.
Also, words that are just track titles (e.g. heavydirtysoul) are removed.
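The inflection rule described above could look something like this. It is a sketch under assumptions: `related_forms`, `drop_shared_inflections` and the `keep` parameter are hypothetical names, the semantic exceptions (like "weathered") are assumed to be supplied by hand, and spelling changes such as consonant doubling or e-dropping are ignored.

```python
SUFFIXES = ("s", "es", "d", "ed", "ing")


def related_forms(word):
    """Return naive inflection candidates of `word` (suffix add/strip only)."""
    forms = {word + suffix for suffix in SUFFIXES}
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            base = word[: -len(suffix)]
            forms.add(base)
            forms.update(base + other for other in SUFFIXES)
    forms.discard(word)
    return forms


def drop_shared_inflections(songs_by_word, keep=frozenset()):
    """Keep single-song words whose inflections occur in no other song.

    `songs_by_word` maps every word to the set of songs it occurs in.
    `keep` is an assumed manual list of semantically distinct words.
    """
    result = {}
    for word, songs in songs_by_word.items():
        if len(songs) != 1:
            continue
        if word not in keep:
            elsewhere = set()
            for form in related_forms(word):
                elsewhere |= songs_by_word.get(form, set())
            if elsewhere - songs:
                continue  # an inflection shows up in a different song
        result[word] = next(iter(songs))
    return result


# The examples from the text above.
index = {
    "sons": {"Addict With A Pen"},
    "son": {"Taxi Cab", "Clear", "Polarize", "No Chances"},
    "weathered": {"Chlorine"},
    "weather": {"Good Day", "Migraine"},
    "buy": {"House of Gold"},
    "bought": {"Car Radio", "Levitate"},
}
kept = drop_shared_inflections(index, keep={"weathered"})
```

Here "sons" is dropped because "son" occurs elsewhere, "weathered" survives only because it is listed in `keep`, and "buy" survives because "bought" is not a suffix inflection.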
For more information please consult files in
A word is shown above the textbox, where you try to guess which song the word came from. The textbox will guess what you mean (for example, typing "ho" narrows down the possibilities to House Of Gold, HOTY and Hometown).
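The narrowing behaviour can be pictured with a few lines of Python. This is only an illustration: the site itself may match differently (for instance on substrings or aliases), and `narrow` is a hypothetical helper that assumes a case-insensitive prefix match on the displayed titles.

```python
def narrow(typed, titles):
    """Return the titles that start with what was typed, ignoring case."""
    query = typed.strip().casefold()
    if not query:
        return []
    return [title for title in titles if title.casefold().startswith(query)]


titles = ["House Of Gold", "HOTY", "Hometown", "Taxi Cab", "Chlorine", "Car Radio"]
matches = narrow("ho", titles)
```

Under that assumption, typing "ho" leaves House Of Gold, HOTY and Hometown, and the first entry is what pressing Enter would select.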
Press Enter to select the first song listed (it should work on mobile keyboards too, just beware of autocorrect), or just click on any of them.
Click Next to see a new, random word. Click Hint to reveal which album the song is in (Self-Titled, RaB, RaB & Vessel, Vessel, Blurryface, Trench, SAI, Single); click again to reveal the words surrounding the one shown. Click Show Answer to reveal everything, including album, track, and the line(s) the word is taken from.
These songs, however, are not included:
"One tøp song" is made by fkfd entirely for entertainment purposes. I make absolutely no warranty. It is not a tool for gatekeeping. Not knowing lyrics to some tøp songs does not make you a fake fan (not knowing any, however, does). It is not a tool for scientific purposes either.
This website collects absolutely no data. I do not sell your information in those terms of agreement. No cookies either. Well, in theory I could see your IP in the server logs, but I have never bothered to check.
Twenty One Pilots and/or Fueled By Ramen are the copyright owners of both the lyrics and the album covers. Fair use or something; I'm not American.
This is not legally binding, but I feel I have the obligation to say this: pet cheetah. Thank you.
The following files are subject to the MIT license:
All Python files in
data/ are in the public domain.
data/1000 is under CC BY-SA.
I don't have any lawyer friends, but as far as I know no one can own non-trademarked words in the English language. On these grounds, all words in the datasets are in the public domain, but the lyrics in the form of full lines are owned by TØP and/or FBR.
The dataset I'm using right now is 70% machine-generated and 30% manual labor. Re-generating it and then redoing all the manual work is way beyond practical. Many times I have manually fixed the dataset after a mistake I made, but forgotten to update the metadata. Therefore, I decided to put this checklist here for future me.
<td> element for track title
grep for word in
data/, remove all occurrences in
scp *.html words.js firstname.lastname@example.org:www/toys/one_top_song/