Architecture

Domain

Dama uses historical Filipino and Filipino English tweets and Reddit posts as data. These posts come from Filipinos and may carry the sentiment of the user who posted them, so a tool built on this data necessarily revolves around the Filipino context. With this in mind, the tool was designed primarily for the general Filipino public, secondarily for Philippine government entities, and lastly for Philippine-based organizations. It helps these target users monitor sentiment changes in the Philippines and use the visualized data to make decisions based on how the public felt at a particular point in time.

Corpus

The corpus of this web application consists of historical Filipino and Filipino English tweets and Reddit posts. It was retrieved mainly through the COHFIE API by Banogon et al., who had already collected Filipino-centered Reddit posts and tweets. Although the COHFIE API supplied most of the data, we also scraped Reddit posts ourselves to complete the dataset. The dataset spans January 2019 to December 2022 to account for different scenarios. Filipino tweets were collected using Filipino stop words (e.g., "saan", "nito", "dapat", and "din"), while English tweets were collected from well-known users in the Philippines who use English as their primary language. Reddit posts, on the other hand, were collected from the well-known subreddits listed in the table below:


3DSPH                    Kwaderno               Tomasino           peyups
ADMU                     LawStudentsPH          artph              phclassifieds
AkoBaYungGago            LoLPHSubreddit         beautytalkph       phinvest
BPOinPH                  NintendoPH             cagayandeoro       phlgbt
CasualPH                 OldSchoolPH            davao              phr4r
Cebu                     PBA                    dlsu               pinoyent
Coronavirus PH           PHBookClub             dostscholars       studentsph
FilipinoFreetdinkers PH  PHGamers               exIglesiaNiCristo
FilipinoHistory          PHikingAndBackpacking  filipinofood
Filipinology             PampamilyangPaoLUL     ilustrado
FilmClubPH               Philippines            indiemusicph
Gulong                   Pilipinas              medschoolph
Iloilo                   RedditPHCyclingClub    mnl
KakaiBalita              Tagalog                opm
Kwaderno                 Tiangge                palawan

For the Reddit posts that we collected ourselves, we followed the same procedure as the COHFIE API to keep the dataset consistent.
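
The stop-word-based collection of Filipino tweets and the January 2019 to December 2022 window described above can be illustrated with a minimal sketch. It is not the COHFIE implementation: the function names, the simple tokenizer, and the four-word stop list (only the examples quoted in the text) are assumptions made for illustration.

```python
import re
from datetime import date

# Only the stop words quoted in the text; the list actually used for
# collection is presumably longer (assumption).
FILIPINO_STOP_WORDS = {"saan", "nito", "dapat", "din"}

# Collection window stated in the text.
WINDOW_START = date(2019, 1, 1)
WINDOW_END = date(2022, 12, 31)


def contains_filipino_stop_word(text: str) -> bool:
    """Return True if any whole word in the text is a listed stop word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(token in FILIPINO_STOP_WORDS for token in tokens)


def in_collection_window(posted: date) -> bool:
    """Return True if the post date falls inside the dataset's time span."""
    return WINDOW_START <= posted <= WINDOW_END


def keep_post(text: str, posted: date) -> bool:
    """Hypothetical filter: keep posts that are in range and look Filipino."""
    return in_collection_window(posted) and contains_filipino_stop_word(text)
```

Matching on whole tokens rather than substrings avoids false positives such as "din" inside the English word "dinner".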