HATE CRIMES ON THE INTERNET: A COMPARISON OF CLASSICAL MACHINE LEARNING MODELS FOR IDENTIFYING HARMFUL SPEECH

Morais, Arthur

Use this identifier to cite or link to this item: https://repositorio.ifgoiano.edu.br/handle/prefix/5883

Full metadata record

DC Field	Value	Language
dc.contributor.advisor1	Silva, Leila	-
dc.contributor.advisor1Lattes	http://lattes.cnpq.br/1190705935250092	pt_BR
dc.creator	Morais, Arthur	-
dc.creator.Lattes	http://lattes.cnpq.br/7011988549412223	pt_BR
dc.date.accessioned	2025-11-11T13:54:34Z	-
dc.date.available	2025-11-05	-
dc.date.available	2025-11-11T13:54:34Z	-
dc.date.issued	2025-09-01	-
dc.identifier.uri	https://repositorio.ifgoiano.edu.br/handle/prefix/5883	-
dc.description.abstract	In the contemporary world, technology plays a fundamental role in social life, and one of its main functions is to facilitate communication between individuals. Social networks, as the name suggests, function as a large virtual community. People have never been so exposed to the comments of third parties, nor has it ever been so easy to comment on others. Many do so anonymously, which intensifies a sense of impunity about what is said and encourages comments whose only purpose is to express hate. Hate crimes can be destructive to victims' lives and must be treated seriously by the governmental and private organizations that manage social networks. Aggressors benefit from variation in national legislation on hate speech, from the difficulty of establishing boundaries in a constantly evolving cyberspace, and from algorithms that give greater engagement to controversial comments. The spread of online hate speech challenges policymakers and the research community worldwide. To help combat these crimes, several natural language processing, data analysis, and threat detection techniques have already been tested and published in the scientific literature, which inspired this work. In this monograph, data mining techniques and supervised machine learning models were implemented to train models capable of identifying these crimes in diverse texts. To test them, a dataset generated dynamically by the Dynabench project from the context of common online hate crimes was used. The techniques tested are compared in order to document the limitations, the best models, and the strategies for detecting this type of crime.	pt_BR
dc.description.provenance	Submitted by Arthur Fernandes Miranda Borges de Morais (arthur.miranda@estudante.ifgoiano.edu.br) on 2025-11-04T14:23:41Z No. of bitstreams: 4 tcc_ArthurMiranda_Publicar (1).pdf: 1739442 bytes, checksum: 8d6c0fd591d7e05e6f18a08f1e262326 (MD5) ata.pdf: 312144 bytes, checksum: 1368689555e633a0b9040e47c7801a2c (MD5) termo.pdf: 158580 bytes, checksum: ec722fb21f4ec8c209578892058caba0 (MD5) ficha-catalográfica.pdf: 64365 bytes, checksum: 4b1e7f2163e9605af4f0ec891ecfd2d5 (MD5)	en
dc.description.provenance	Rejected by Hevellin Estrela (hevellin.estrela@ifgoiano.edu.br), reason: Dear ARTHUR, Your submission was rejected for adjustments for the following reason: -- The submission must be a single .pdf file that includes the documents signed by the examining committee and the advisor. The signed documents were submitted separately. Please replace the documents, combine them into a single file, and resubmit. The author(s) must review the final version of the academic work and generate a PDF of that version, with the required proof of approval, containing in a single file the pages in the following order: 1. Cover; 2. Title page; 3. TCAE; 4. Defense minutes; 5. Defended work. We await its return with the requested changes. We remain at your disposal. Sincerely, --- on 2025-11-04T17:23:26Z (GMT)	en
dc.description.provenance	Submitted by Arthur Fernandes Miranda Borges de Morais (arthur.miranda@estudante.ifgoiano.edu.br) on 2025-11-10T11:34:26Z No. of bitstreams: 1 TCC_Arthur_Miranda_Oficial.pdf: 1860803 bytes, checksum: 6da8d1f0bc8b220e8f3930bda67eb2ae (MD5)	en
dc.description.provenance	Approved for entry into archive by Itala Moreira Alves (itala.moreira@ifgoiano.edu.br) on 2025-11-11T13:33:07Z (GMT) No. of bitstreams: 1 TCC_Arthur_Miranda_Oficial.pdf: 1860803 bytes, checksum: 6da8d1f0bc8b220e8f3930bda67eb2ae (MD5)	en
dc.description.provenance	Approved for entry into archive by Itala Moreira Alves (itala.moreira@ifgoiano.edu.br) on 2025-11-11T13:54:34Z (GMT) No. of bitstreams: 1 TCC_Arthur_Miranda_Oficial.pdf: 1860803 bytes, checksum: 6da8d1f0bc8b220e8f3930bda67eb2ae (MD5)	en
dc.description.provenance	Made available in DSpace on 2025-11-11T13:54:34Z (GMT). No. of bitstreams: 1 TCC_Arthur_Miranda_Oficial.pdf: 1860803 bytes, checksum: 6da8d1f0bc8b220e8f3930bda67eb2ae (MD5) Previous issue date: 2025-09-01	en
dc.language	por	pt_BR
dc.publisher	Instituto Federal Goiano	pt_BR
dc.publisher.country	Brazil	pt_BR
dc.publisher.department	Campus Morrinhos	pt_BR
dc.publisher.initials	IF Goiano	pt_BR
dc.rights	Open Access	pt_BR
dc.subject	hate speech	pt_BR
dc.subject	natural language processing	pt_BR
dc.subject	social networks	pt_BR
dc.subject	ethics	pt_BR
dc.subject	supervised machine learning	pt_BR
dc.subject.cnpq	EXACT AND EARTH SCIENCES::COMPUTER SCIENCE::MATHEMATICS OF COMPUTING::ANALYTICAL AND SIMULATION MODELS	pt_BR
dc.title	CRIMES DE ÓDIO NA INTERNET: UM COMPARATIVO DE MODELOS DE APRENDIZADO DE MÁQUINA CLÁSSICOS PARA IDENTIFICAÇÃO DE DISCURSO NOCIVO	pt_BR
dc.type	Undergraduate thesis (Trabalho de Conclusão de Curso)	pt_BR
Appears in collections:	Bachelor's in Computer Science (Bacharelado em Ciência da Computação)

Files associated with this item:

File	Description	Size	Format
TCC_Arthur_Miranda_Oficial.pdf		1.82 MB	Adobe PDF
