Image Preprocessing and OCR to Improve Smishing Detection

Pablo Blanco Medina; Andrés Carofilis; Eduardo Fidalgo; Enrique Alegre

doi:10.17979/ja-cea.2024.45.10955

Authors

Pablo Blanco Medina, Universidad de León
Andrés Carofilis, Universidad de León
Eduardo Fidalgo, Universidad de León
Enrique Alegre, Universidad de León

DOI:

https://doi.org/10.17979/ja-cea.2024.45.10955

Keywords:

Security, Deep Learning, Support for Human Operators, Social Networks, Control and Automation Systems for International Aid

Abstract

The globalization of communication technologies has led to an increase in scams that rely on phishing techniques. Computer Emergency Response Teams (CERTs) receive screenshots submitted by users whose smartphones have received suspicious messages. These SMS messages impersonate well-known companies to persuade recipients to take urgent action, which allows attackers to steal their data or carry out unauthorized operations on their bank accounts. Such messages are known as smishing, and CERTs are interested in tools that automate the extraction of URLs from screenshots so that they can be checked for phishing. In this work, we propose a strategy for extracting URLs from screenshots that combines traditional computer vision techniques, such as preprocessing and morphological operations, with URL detection and recognition mechanisms to recover suspicious URLs. Evaluating our proposal on 117 smishing screenshots containing 121 URLs, we achieve an accuracy of 61.16% in recovering URLs from smishing messages.
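The abstract does not give implementation details, so the following is only a minimal sketch of the kind of pipeline it describes, not the authors' method. It assumes a grayscale image stored as nested lists, a global threshold, a square structuring element, a regular expression for URL candidates, and exact string matching for evaluation. All function names, the threshold value, and the pattern are hypothetical. The OCR step itself is left out: `extract_urls` takes text that some OCR engine is assumed to have already produced.

```python
import re

# Hypothetical URL pattern (an assumption, not taken from the paper):
# either an explicit http(s):// or www. prefix, or a bare domain with a TLD.
URL_PATTERN = re.compile(
    r"(?:https?://|www\.)[^\s<>\"']+"
    r"|\b[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}(?:/[^\s<>\"']*)?",
    re.IGNORECASE,
)


def binarize(gray, threshold=128):
    """Global threshold: pixels darker than `threshold` become foreground (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def dilate(binary, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element.

    Dilation thickens foreground strokes, which can merge neighbouring
    characters into one blob that is easier to localize as a text region.
    """
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x]:
                for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def extract_urls(ocr_text):
    """Return URL candidates found in OCR output, without trailing punctuation."""
    return [m.group(0).rstrip(".,;:!?)") for m in URL_PATTERN.finditer(ocr_text)]


def recovery_rate(predicted, ground_truth):
    """Fraction of ground-truth URLs that appear verbatim among the predictions."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    found = set(predicted)
    hits = sum(1 for url in ground_truth if url in found)
    return hits / len(ground_truth)


if __name__ == "__main__":
    text = "Your parcel is on hold. Pay at https://example-post.com/pay?id=1 now."
    print(extract_urls(text))
    # 74 recovered URLs is an inference from the reported 61.16% of 121,
    # not a figure stated in the abstract.
    print(round(74 / 121 * 100, 2))
```

The bare-domain branch of the pattern will also match strings such as file names, so a real system would likely need stricter validation. As a consistency check on the reported figure, 74 exact matches out of 121 URLs gives 61.16%, although the abstract does not state the number of recovered URLs or the matching criterion.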

