LyCra - Lyrics Crawler
LyCra stands for Lyrics Crawler. It provides an environment in which crawlers search for the lyrics of the songs in the database.
A crawler is a Python script in the crawler directory lib/crawler.
It is a class derived from lib.crawlerapi.LycraCrawler, as shown in the class diagram below.
The file name of a crawler must exactly match the name of its class.
If the crawler's file name is Example.py, the class definition must be class Example(LycraCrawler).
The constructor must take one parameter, a LycraDatabase database object.
The class diagram for a crawler:
A minimal crawler implementation looks like this:
    from lib.crawlerapi import LycraCrawler
    from lib.db.lycradb import LycraDatabase

    class Example(LycraCrawler):
        def __init__(self, db):
            # Register the crawler with its name and version
            LycraCrawler.__init__(self, db, "Example", "1.0.0")

        def DoCrawl(self, artistname, albumname, songname, songid):
            # This minimal crawler never finds lyrics
            return False
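A crawler that actually finds lyrics would presumably hand them to the database and return True. The following is a minimal sketch of that flow, not MusicDB's implementation. It assumes the MusicDB modules are not importable, so `FakeLycraDatabase` and the `LycraCrawler` class below are hypothetical stand-ins, the attribute names `self.db` and `self.name` are assumptions, and `FetchLyrics` is an invented helper that replaces a real web request.

```python
# Hypothetical stand-ins so the sketch runs without MusicDB installed.
class FakeLycraDatabase:
    """Mimics only WriteLyricsToCache(crawler, songid, lyrics, url)."""
    def __init__(self):
        self.cache = {}

    def WriteLyricsToCache(self, crawler, songid, lyrics, url):
        self.cache[(crawler, songid)] = (lyrics, url)


class LycraCrawler:
    """Stand-in base class; the attribute names are assumptions."""
    def __init__(self, db, name, version):
        self.db = db
        self.name = name
        self.version = version


class Example(LycraCrawler):
    def __init__(self, db):
        LycraCrawler.__init__(self, db, "Example", "1.0.0")

    def FetchLyrics(self, artistname, songname):
        # Invented helper: a real crawler would query a website here.
        known = {("Artist", "Song"): ("La la la", "https://example.org/song")}
        return known.get((artistname, songname))

    def DoCrawl(self, artistname, albumname, songname, songid):
        result = self.FetchLyrics(artistname, songname)
        if result is None:
            return False  # no lyrics found
        lyrics, url = result
        # Store the lyrics under this crawler's name and report success
        self.db.WriteLyricsToCache(self.name, songid, lyrics, url)
        return True
```

With these stand-ins, `Example(FakeLycraDatabase()).DoCrawl("Artist", "Album", "Song", 1)` stores one cache entry and returns True, while an unknown song returns False and leaves the cache untouched.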
Lycra API Class
- class mdbapi.lycra.Lycra(config)
This class does the main lyrics management.
- Parameters
config – MusicDB Configuration object.
- Raises
TypeError – when config is not of type MusicDBConfig
- CrawlForLyrics(artistname, albumname, songname, songid)
Loads all crawlers from the crawler directory via LoadCrawlers() and runs them via RunCrawler().
- Parameters
artistname (str) – The name of the artist as stored in the music database
albumname (str) – The name of the album as stored in the music database
songname (str) – The name of the song as stored in the music database
songid (int) – The ID of the song to associate the lyrics with
- Returns
False if something went wrong, otherwise True. (True does not indicate that any lyrics were found!)
- GetLyrics(songid)
This method returns the lyrics of a song. See lib.db.lycradb.LycraDatabase.GetLyricsFromCache().
- LoadCrawlers()
This method loads all crawlers inside the crawler directory.
Warning
Changes to a crawler may not be recognized until the whole application is restarted. Only newly added crawlers get loaded. Crawlers that are already loaded remain in Python's module cache.
- Returns
None
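The caching behavior described in the warning can be reproduced with Python's standard import machinery. The sketch below is not the code of LoadCrawlers(); it only assumes that crawlers are imported by module name. The function `LoadCrawlerModules` and the module names `lycrademo_a` and `lycrademo_b` are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def LoadCrawlerModules(crawlerdir):
    """Import every *.py file in crawlerdir; return {name: module}."""
    if crawlerdir not in sys.path:
        sys.path.insert(0, crawlerdir)
    importlib.invalidate_caches()  # make files created at runtime visible
    modules = {}
    for filename in sorted(os.listdir(crawlerdir)):
        name, ext = os.path.splitext(filename)
        if ext != ".py":
            continue
        # import_module returns the entry from sys.modules if one exists,
        # so an already imported file is not read again.
        modules[name] = importlib.import_module(name)
    return modules


with tempfile.TemporaryDirectory() as crawlerdir:
    path = os.path.join(crawlerdir, "lycrademo_a.py")
    with open(path, "w") as f:
        f.write('VERSION = "1.0.0"\n')
    first = LoadCrawlerModules(crawlerdir)["lycrademo_a"].VERSION

    # Change the existing file and add a new one, then load again.
    with open(path, "w") as f:
        f.write('VERSION = "2.0.0"\n')
    with open(os.path.join(crawlerdir, "lycrademo_b.py"), "w") as f:
        f.write('VERSION = "0.1.0"\n')
    reloaded = LoadCrawlerModules(crawlerdir)
    second = reloaded["lycrademo_a"].VERSION  # still the cached module
    added = reloaded["lycrademo_b"].VERSION   # the new file gets imported
    sys.path.remove(crawlerdir)
```

After the second load, `second` still holds the value from the first import, while `added` comes from the newly created file. `importlib.reload()` would re-read a changed module, but whether that is safe for a running crawler environment is not covered here.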
- RunCrawler(crawler, artistname, albumname, songname, songid)
This method runs a specific crawler. The crawler is given all available information to search for the lyrics of a specific song.
This method is for class-internal use. When using this class, call CrawlForLyrics() instead of calling this method directly. Before calling this method, LoadCrawlers() must be called.
The crawler base class lib.crawlerapi.LycraCrawler catches all exceptions, so crawlers do not need to be executed in a try-except environment.
- Parameters
crawler (str) – Name of the crawler. If it addresses the file lib/crawler/example.py, the name is example
artistname (str) – The name of the artist as stored in the MusicDatabase
albumname (str) – The name of the album as stored in the MusicDatabase
songname (str) – The name of the song as stored in the MusicDatabase
songid (int) – The ID of the song to associate the lyrics with
- Returns
None
Crawler Base Class
- class lib.crawlerapi.LycraCrawler(db, name, version)
This is the base class for all crawlers.
- Parameters
db – A LycraDatabase database object.
name (str) – Name of the crawler. It should be the same name the class and file have.
version (str) – A version number in the format major.minor.patchlevel as a string. For example "1.0.0"
- Raises
TypeError – If the db argument is not of type LycraDatabase
TypeError – If name or version number are not of type str
- Crawl(artistname, albumname, songname, songid)
This method gets called by the lyrics manager mdbapi.lycra.Lycra. It provides a small environment to fit the crawler into MusicDB's infrastructure. It catches exceptions and measures the time the crawler needs to run.
- Parameters
artistname (str) – The name of the artist as stored in the MusicDatabase
albumname (str) – The name of the album as stored in the MusicDatabase
songname (str) – The name of the song as stored in the MusicDatabase
songid (int) – The ID of the song to associate the lyrics with
- Returns
True if the crawler found lyrics, otherwise False
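A wrapper of this kind can be sketched as follows. This is a stand-in, not MusicDB's base class: the attribute `lastduration`, the printed message, and the decision to return False when an exception occurs are assumptions made for illustration.

```python
import time


class LycraCrawler:
    """Stand-in base class illustrating the wrapper around DoCrawl."""
    def __init__(self, db, name, version):
        self.db = db
        self.name = name
        self.version = version
        self.lastduration = None  # hypothetical: seconds the last run took

    def Crawl(self, artistname, albumname, songname, songid):
        start = time.monotonic()
        try:
            found = self.DoCrawl(artistname, albumname, songname, songid)
        except Exception as e:
            # A failing crawler must not take down the lyrics manager
            print("Crawler %s failed: %s" % (self.name, e))
            found = False
        finally:
            self.lastduration = time.monotonic() - start
        return bool(found)

    def DoCrawl(self, artistname, albumname, songname, songid):
        raise NotImplementedError("Derived classes implement DoCrawl")


class Broken(LycraCrawler):
    """Crawler whose DoCrawl always raises, to exercise the wrapper."""
    def __init__(self, db):
        LycraCrawler.__init__(self, db, "Broken", "1.0.0")

    def DoCrawl(self, artistname, albumname, songname, songid):
        raise RuntimeError("website not reachable")


crawler = Broken(db=None)
found = crawler.Crawl("Artist", "Album", "Song", 1)
# found is False; the exception was caught inside Crawl
```

Because the wrapper absorbs the exception, the caller only sees a boolean result and can continue with the next crawler.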
- DoCrawl(artistname, albumname, songname, songid)
This is the prototype the derived class has to implement for crawling.
- Parameters
artistname (str) – The name of the artist as stored in the music database
albumname (str) – The name of the album as stored in the music database
songname (str) – The name of the song as stored in the music database
songid (int) – The ID of the song to associate the lyrics with
- Returns
True if the crawler found lyrics, otherwise False
Lycra Database
Lyrics cache entry:
id | crawler | songid | updatetime | url | lyrics
- crawler:
Name of the crawler
- updatetime:
Unix timestamp of the last time this entry was updated
- url:
URL from which the lyrics were loaded
- lyrics:
The lyrics themselves. This entry is compressed using the lib.db.database.Database.Compress() method.
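The entry layout above could map to a table like the one in this sketch. It assumes an SQLite backend and uses zlib as a stand-in for compression; the table name `lyricscache`, the column types, and the choice of zlib are all assumptions, since the source only names the columns and the Compress() method.

```python
import sqlite3
import time
import zlib

# Hypothetical schema derived from the column names listed above
SCHEMA = """
CREATE TABLE lyricscache (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    crawler    TEXT,
    songid     INTEGER,
    updatetime INTEGER,
    url        TEXT,
    lyrics     BLOB
)
"""

connection = sqlite3.connect(":memory:")
connection.execute(SCHEMA)

# Compress the lyrics before storing them (zlib stands in for Compress())
compressed = zlib.compress("La la la".encode("utf-8"))
connection.execute(
    "INSERT INTO lyricscache (crawler, songid, updatetime, url, lyrics)"
    " VALUES (?, ?, ?, ?, ?)",
    ("Example", 1, int(time.time()), "https://example.org/song", compressed),
)

# Read the entry back and restore the original text
row = connection.execute(
    "SELECT crawler, songid, lyrics FROM lyricscache"
).fetchone()
restored = zlib.decompress(row[2]).decode("utf-8")
```

The lyrics column holds opaque bytes, so anything reading the table directly has to decompress the value before use.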
- class lib.db.lycradb.LycraDatabase(path)
Derived from lib.db.database.Database.
- Parameters
path (str) – Absolute path to the LyCra database file.
- Raises
ValueError – When the version of the database does not match the expected version. (Updating MusicDB may have failed)
- GetLyricsFromCache(songid)
This method returns a list of all cache entries that match the songid.
- Parameters
songid (int) – ID of a song
- Returns
A list of entries with lyrics, or None if nothing was found.
- WriteLyricsToCache(crawler, songid, lyrics, url)
This method writes the lyrics a crawler found into the database. If there is already an entry for the combination of songid and crawler, this entry gets updated.
The lyrics will be compressed.
- Parameters
crawler (str) – Name of the crawler that found the lyrics
songid (int) – ID of the song the lyrics belong to
lyrics (str) – The lyrics that shall be stored
url (str) – The source of the lyrics
- Returns
None
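The update-or-insert behavior of WriteLyricsToCache() and the list-or-None result of GetLyricsFromCache() can be sketched together. The class `SketchLycraCache` below is hypothetical and does not reproduce LycraDatabase: it assumes an in-memory SQLite table named `lyricscache`, uses zlib in place of the Compress() method, and returns entries as dictionaries, a format the source does not specify.

```python
import sqlite3
import time
import zlib


class SketchLycraCache:
    """Hypothetical stand-in for the two cache methods of LycraDatabase."""
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE lyricscache (id INTEGER PRIMARY KEY, crawler TEXT,"
            " songid INTEGER, updatetime INTEGER, url TEXT, lyrics BLOB)")

    def WriteLyricsToCache(self, crawler, songid, lyrics, url):
        data = zlib.compress(lyrics.encode("utf-8"))
        now = int(time.time())
        # Try to update the entry for this crawler/song combination first
        cursor = self.db.execute(
            "UPDATE lyricscache SET updatetime=?, url=?, lyrics=?"
            " WHERE crawler=? AND songid=?",
            (now, url, data, crawler, songid))
        if cursor.rowcount == 0:
            # No such entry yet, so create it
            self.db.execute(
                "INSERT INTO lyricscache"
                " (crawler, songid, updatetime, url, lyrics)"
                " VALUES (?, ?, ?, ?, ?)",
                (crawler, songid, now, url, data))
        self.db.commit()
        return None

    def GetLyricsFromCache(self, songid):
        rows = self.db.execute(
            "SELECT crawler, url, lyrics FROM lyricscache WHERE songid=?"
            " ORDER BY id",
            (songid,)).fetchall()
        if not rows:
            return None  # nothing cached for this song
        return [{"crawler": crawler,
                 "url": url,
                 "lyrics": zlib.decompress(data).decode("utf-8")}
                for crawler, url, data in rows]


cache = SketchLycraCache()
cache.WriteLyricsToCache("Example", 1, "old text", "https://example.org/a")
cache.WriteLyricsToCache("Example", 1, "new text", "https://example.org/a")
cache.WriteLyricsToCache("Other", 1, "other text", "https://example.org/b")
entries = cache.GetLyricsFromCache(1)
```

Writing twice for the same crawler and song leaves a single, updated entry, while a second crawler adds its own entry for the same song. This is why a lookup by songid returns a list.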