Cybercrime Prevention Methodology Using Web Scraping & Natural Language Processing to Detect Data Leak on the Dark Web (currently under peer review)
This work introduces a methodology for capturing and processing leaked data that has been put up for sale on the black markets of the Dark Web, an area of the Internet with a high prevalence of criminal content. A main challenge in collecting and processing critical data sold on the Dark Web is that the content is volatile and unstructured. For this reason, the proposed methodology combines Web Scraping and Natural Language Processing techniques to detect sensitive data published on these markets and to prevent its use in extortion, identity theft, disclosure of secrets, and other types of cybercrime.
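To make the scrape-then-detect idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a listing page has already been fetched as an HTML string (retrieval over Tor is out of scope here), and it substitutes simple regular expressions plus a Luhn checksum for the Natural Language Processing step. All names (`TextExtractor`, `html_to_text`, `find_sensitive`, `luhn_ok`) and the sample page are hypothetical.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Reduce an unstructured page to a single line of plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Illustrative patterns only; a real detector would need far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text):
    """Find e-mail addresses and Luhn-valid card-like numbers in plain text."""
    emails = sorted(set(EMAIL_RE.findall(text)))
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits) and digits not in cards:
            cards.append(digits)
    return {"emails": emails, "cards": cards}


# Hypothetical listing page, used only to exercise the functions above.
sample = (
    "<html><body><h1>Fresh dump</h1>"
    "<p>alice@example.com 4111 1111 1111 1111</p>"
    '<script>var x = "bob@hidden.com";</script>'
    "<p>order 1234 5678 9012 3456</p></body></html>"
)
result = find_sensitive(html_to_text(sample))
print(result)
```

The Luhn check is included to discard digit runs that merely look like card numbers, such as the order number in the sample. A pipeline following the abstract would presumably replace the regular expressions with trained language models and re-crawl sources frequently to cope with content volatility.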