Design of curricular evaluation instruments assisted by generative artificial intelligence

Authors

Keywords:

artificial intelligence, CIPP model, curriculum evaluation, measurement instruments, program evaluation

Abstract

Generative artificial intelligence is having an increasingly significant impact on everyday life, in tasks ranging from the simple to the highly complex. In this context, it is establishing itself as a valuable support tool for the design and validation of educational assessment instruments. The present study aims to design curriculum evaluation instruments by applying the CIPP model, Messick's modern psychometric theory of instrument design, and the principles of prompt engineering, using generative artificial intelligence as a methodological support tool. The research follows the Theory-Based Design and Development methodology. The results are presented as tangible products: (a) a matrix of dimensions, criteria, and indicators derived from the CIPP model; (b) a definition of the assessment construct; and (c) a component-based prompting framework consisting of three prompts. The study concludes that integrating the CIPP model, psychometric theory, and prompt engineering allows for the design of structured prompts that guide generative artificial intelligence to generate valid and coherent items, strengthening the quality and relevance of curriculum evaluation instrument design.

References

AERA, APA, & NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.

Ahmady, S., Kohan, N., Mirmoghtadaie, Z. S., Hamidi, H., Divshali, B. S., & Rakhshani, T. (2023). Designing and psychometric analysis of an instrument to assess learning process in a virtual environment. Smart Learning Environments, 10, 35. https://doi.org/10.1186/s40561-023-00254-w

Aránguiz, K. (2020). Propuesta de modelo de evaluación curricular para carreras y programas de pregrado de la Universidad Católica de la Santísima Concepción [Master's thesis, Universidad del Desarrollo]. Repositorio UDD. https://repositorio.udd.cl/items/34ddca7d-60c8-46df-ac23-f50ae8d6e209

Arias, M. R. M., Lloreda, M. J. H., & Lloreda, M. V. H. (2014). Psicometría. Alianza Editorial.

Aziz, S., et al. (2018). Implementation of CIPP model for quality evaluation at school level: A case study. Journal of Education and Educational Development, 5(1), 189–206. https://files.eric.ed.gov/fulltext/EJ1180614.pdf

Carrillo-Cayllahua, J., Cóndor-Salvatierra, E., Oré-Rojas, J., & Gonzales-Castro, A. (2023). Evaluación curricular de una carrera profesional en educación superior universitaria. Inudi Perú. https://doi.org/10.35622/inudi.b.123

Cely, M., & Quiñones, A. (2022). Revisión sistemática de las características de evaluación curricular en programas académicos de pregrado a través del método PRISMA-NMA. Revista Electrónica Calidad en la Educación Superior, 13(2), 150–174. https://doi.org/10.22458/caes.v13i2.4415

Chamorro, D., & Borjas, M. (2020). Investigación evaluativa curricular: Un camino a la transformación del aula. Universidad del Norte. https://manglar.uninorte.edu.co/bitstream/handle/10584/9252/9789587892185%20eInvestigacion%20evaluativa%20curricular.pdf

Contreras, M. (2021). Modelo integral para la evaluación curricular: De las variables a los instrumentos. UNAM. https://www.zaragoza.unam.mx/wp-content/Portal2015/publicaciones/libros/csociales/Modelo_integral.pdf

Flores Contrera, C. J. (2024). La evaluación educativa en la era de la inteligencia artificial: Cambios de paradigmas. LATAM Revista Latinoamericana de Ciencias Sociales y Humanidades, 5(1), 1579–1591. https://doi.org/10.56712/latam.v5i1.1694

Gregor, S., & Hevner, A. R. (2013). Positioning and presenting design science research for maximum impact. MIS Quarterly, 37(2), 337–356.

Hernández-León, N., & Rodríguez-Conde, M. J. (2023). Inteligencia artificial aplicada a la educación y la evaluación educativa en la universidad. Revista de Educación a Distancia (RED), 23(71). https://doi.org/10.6018/red

Inciarte, A., & Canquiz, L. (2001). Análisis de la consistencia interna del currículo. Informe de Investigaciones Educativas, 15(1–2), 79–90. https://www.researchgate.net/publication/237265696_Analisis_de_la_consistencia_interna_del_curriculo

Karatas, H., & Fer, S. (2011). CIPP evaluation model scale: Development, reliability and validity. Procedia - Social and Behavioral Sciences, 15, 592–599. https://doi.org/10.1016/j.sbspro.2011.03.146

Korzynski, P., Mazurek, G., Krzypkowska, P., & Kurasinski, A. (2023). Artificial intelligence prompt engineering as a new digital competence: Analysis of generative AI technologies such as ChatGPT. Entrepreneurial Business and Economics Review, 11(3), 25–37. https://doi.org/10.15678/EBER.2023.110302

Lee, D., & Palmer, E. (2025). Prompt engineering in higher education: A systematic review to help inform curricula. International Journal of Educational Technology in Higher Education, 22, 7. https://doi.org/10.1186/s41239-025-00503-7

Lo, L. S. (2023). The CLEAR path: A framework for enhancing information literacy through prompt engineering. The Journal of Academic Librarianship, 49, 102720. https://doi.org/10.1016/j.acalib.2023.102720

López, R., Valdez, A., & Martínez, J. (2020). Validez de constructo y confiabilidad de un instrumento para evaluar la docencia en educación superior. Revista de Evaluación Educativa, 12(3), 45–63.

Maaz, S., Palaganas, J. C., Palaganas, G., & Bajwa, M. (2025). A guide to prompt design: Foundations and applications for healthcare simulationists. Frontiers in Medicine, 11, 1504532. https://doi.org/10.3389/fmed.2024.1504532

Méndez-Mantuano, M. O., Morán, M. Y. O., Mayorga, I. I. C., Valdez, A. Y. L., Rosado, Á. R. H., & Robles, D. V. A. (2024). La evaluación académica en la era de la inteligencia artificial (IA). South Florida Journal of Development, 5(1), 119–148. https://doi.org/10.46932/sfjdv5n1-010

Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). American Council on Education & Macmillan.

Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749. https://doi.org/10.1037/0003-066X.50.9.741

Mora, A. (2004). La evaluación educativa: Concepto, períodos y modelos. Actualidades Investigativas en Educación, 4(2). https://www.redalyc.org/pdf/447/44740211.pdf

Pizano, G. (2014). Modelos de evaluación curricular. Revista Investigación Educativa, 4(6), 15–22. https://revistasinvestigacion.unmsm.edu.pe/index.php/educa/article/view/7640/6649

Ramos-Armijos, D. F., Ramos-Armijos, D. G., Tapia-Puga, V. M., y Tapia-Puga, L. I. (2023). Explorando las fronteras: La aplicación de inteligencia artificial en la evaluación educativa. Ciencia Latina Revista Científica Multidisciplinar, 7(6), 5657–5667. https://doi.org/10.37811/cl_rcm.v7i6.9108

Reeves, T. C. (2006). Design research from a technology perspective. In J. van den Akker, K. Gravemeijer, S. McKenney, & N. Nieveen (Eds.), An introduction to educational design research (pp. 52–66). Netherlands Institute for Curriculum Development (SLO).

Schulhoff, S., Ilie, M., Balepur, N., et al. (2024). The prompt report: A systematic survey of prompting techniques. University of Maryland. https://arxiv.org/abs/2406.06608

Stufflebeam, D. L., & Shinkfield, A. J. (2007). Evaluation theory, models, and applications. Jossey-Bass.

Taureaux Díaz, N., Miralles Aguilera, E., Vicedo Tomey, A., & Díaz-Perera, G. (2016). Instrumento para la evaluación curricular de la función de administración en la carrera de Medicina. Revista Habanera de Ciencias Médicas, 15(5), 769–781.

Velásquez-Henao, J. D., Franco-Cardona, C. J., & Cadavid-Higuita, L. (2023). Prompt engineering: A methodology for optimizing interactions with AI-language models in the field of engineering. Revista DYNA, 90(230), 9–17. https://doi.org/10.15446/dyna.v90n230.111700

White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., & Schmidt, D. C. (2023). A prompt pattern catalog to enhance prompt engineering with ChatGPT. Vanderbilt University. https://arxiv.org/abs/2302.11382

Published

2026-04-01

How to Cite

Design of curricular evaluation instruments assisted by generative artificial intelligence. (2026). La Universidad, 7(2), 11–37. https://revistas.ues.edu.sv/index.php/launiversidad/article/view/3886