Design of curricular evaluation instruments assisted by generative artificial intelligence
Keywords: artificial intelligence, CIPP model, curriculum evaluation, measurement instruments, program evaluation

Abstract
Generative artificial intelligence is having an increasingly significant impact on everyday life, from simple tasks to highly complex processes. In this context, it is establishing itself as a valuable support tool for designing and validating educational assessment instruments. The present study aims to design curriculum evaluation instruments by applying the CIPP model, Messick's modern psychometric theory of instrument design, and the principles of prompt engineering, using generative artificial intelligence as a methodological support tool. The research follows a theory-based design and development methodology. The results are presented as tangible products: (a) a matrix of dimensions, criteria, and indicators derived from the CIPP model; (b) a definition of the evaluation construct; and (c) a component-based prompting framework consisting of three prompts. In conclusion, the study shows that integrating the CIPP model, psychometric theory, and prompt engineering makes it possible to design structured prompts that guide generative artificial intelligence to produce valid and coherent items, strengthening the quality and relevance of curriculum evaluation instrument design.
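The abstract's component-based prompting framework can be pictured as structured prompts assembled from labeled components (role, CIPP-derived context, task, output format). The sketch below is a minimal illustration of that idea, not the authors' actual prompts: the matrix cells, role names, and template wording are all hypothetical assumptions.

```python
# Hypothetical sketch of a component-based prompting framework built on a
# CIPP-derived matrix of dimensions, criteria, and indicators. All names and
# wording here are illustrative; the article's actual prompts may differ.

# A tiny stand-in for the dimensions/criteria/indicators matrix (CIPP model).
CIPP_MATRIX = {
    "Context": {
        "criterion": "Relevance of program objectives",
        "indicator": "Alignment of objectives with stakeholder needs",
    },
    "Input": {
        "criterion": "Adequacy of curricular resources",
        "indicator": "Availability of qualified teaching staff",
    },
}

def build_prompt(role: str, task: str, dimension: str, cell: dict) -> str:
    """Assemble one structured prompt from labeled components."""
    return (
        f"ROLE: {role}\n"
        f"CONTEXT: CIPP dimension '{dimension}'; "
        f"criterion '{cell['criterion']}'; indicator '{cell['indicator']}'.\n"
        f"TASK: {task}\n"
        "OUTPUT FORMAT: numbered Likert-scale items, one per line."
    )

# Three prompt stages, loosely mirroring the abstract's three prompts:
# construct definition, item drafting, and a validity-oriented review.
stages = [
    ("curriculum evaluator", "Define the evaluation construct for this indicator."),
    ("item writer", "Draft three Likert-scale items measuring this indicator."),
    ("psychometrics reviewer", "Review the drafted items for construct validity."),
]

prompts = [
    build_prompt(role, task, "Context", CIPP_MATRIX["Context"])
    for role, task in stages
]
```

Each resulting prompt is a self-contained block that can be sent to a generative model in sequence; keeping the components explicit is what lets the CIPP matrix and the construct definition constrain what the model generates.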
References
AERA, APA, & NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.
Ahmady, S., Kohan, N., Mirmoghtadaie, Z. S., Hamidi, H., Divshali, B. S., & Rakhshani, T. (2023). Designing and psychometric analysis of an instrument to assess learning process in a virtual environment. Smart Learning Environments, 10, 35. https://doi.org/10.1186/s40561-023-00254-w
Aránguiz, K. (2020). Propuesta de modelo de evaluación curricular para carreras y programas de pregrado de la Universidad Católica de la Santísima Concepción [Tesis de maestría, Universidad del Desarrollo]. Repositorio UDD. https://repositorio.udd.cl/items/34ddca7d-60c8-46df-ac23-f50ae8d6e209
Arias, M. R. M., Lloreda, M. J. H., & Lloreda, M. V. H. (2014). Psicometría. Alianza Editorial.
Aziz, S., et al. (2018). Implementation of CIPP model for quality evaluation at school level: A case study. Journal of Education and Educational Development, 5(1), 189–206. https://files.eric.ed.gov/fulltext/EJ1180614.pdf
Carrillo-Cayllahua, J., Cóndor-Salvatierra, E., Oré-Rojas, J., & Gonzales-Castro, A. (2023). Evaluación curricular de una carrera profesional en educación superior universitaria. Inudi Perú. https://doi.org/10.35622/inudi.b.123
Cely, M., & Quiñones, A. (2022). Revisión sistemática de las características de evaluación curricular en programas académicos de pregrado a través del método PRISMA-NMA. Revista Electrónica Calidad en la Educación Superior, 13(2), 150–174. https://doi.org/10.22458/caes.v13i2.4415
Chamorro, D., & Borjas, M. (2020). Investigación evaluativa curricular: Un camino a la transformación del aula. Universidad del Norte. https://manglar.uninorte.edu.co/bitstream/handle/10584/9252/9789587892185%20eInvestigacion%20evaluativa%20curricular.pdf
Contreras, M. (2021). Modelo integral para la evaluación curricular: De las variables a los instrumentos. UNAM. https://www.zaragoza.unam.mx/wp-content/Portal2015/publicaciones/libros/csociales/Modelo_integral.pdf
Flores Contrera, C. J. (2024). La evaluación educativa en la era de la inteligencia artificial: Cambios de paradigmas. LATAM Revista Latinoamericana de Ciencias Sociales y Humanidades, 5(1), 1579–1591. https://doi.org/10.56712/latam.v5i1.1694
Gregor, S., & Hevner, A. R. (2013). Positioning and presenting design science research for maximum impact. MIS Quarterly, 37(2), 337–356.
Hernández-León, N., & Rodríguez-Conde, M. J. (2023). Inteligencia artificial aplicada a la educación y la evaluación educativa en la universidad. Revista de Educación a Distancia (RED), 23(71). https://doi.org/10.6018/red
Inciarte, A., & Canquiz, L. (2001). Análisis de la consistencia interna del currículo. Informe de Investigaciones Educativas, 15(1–2), 79–90. https://www.researchgate.net/publication/237265696_Analisis_de_la_consistencia_interna_del_curriculo
Karatas, H., & Fer, S. (2011). CIPP evaluation model scale: Development, reliability and validity. Procedia - Social and Behavioral Sciences, 15, 592–599. https://doi.org/10.1016/j.sbspro.2011.03.146
Korzynski, P., Mazurek, G., Krzypkowska, P., & Kurasinski, A. (2023). Artificial intelligence prompt engineering as a new digital competence: Analysis of generative AI technologies such as ChatGPT. Entrepreneurial Business and Economics Review, 11(3), 25–37. https://doi.org/10.15678/EBER.2023.110302
Lee, D., & Palmer, E. (2025). Prompt engineering in higher education: A systematic review to help inform curricula. International Journal of Educational Technology in Higher Education, 22, 7. https://doi.org/10.1186/s41239-025-00503-7
Lo, L. S. (2023). The CLEAR path: A framework for enhancing information literacy through prompt engineering. The Journal of Academic Librarianship, 49, 102720. https://doi.org/10.1016/j.acalib.2023.102720
López, R., Valdez, A., & Martínez, J. (2020). Validez de constructo y confiabilidad de un instrumento para evaluar la docencia en educación superior. Revista de Evaluación Educativa, 12(3), 45–63.
Maaz, S., Palaganas, J. C., Palaganas, G., & Bajwa, M. (2025). A guide to prompt design: Foundations and applications for healthcare simulationists. Frontiers in Medicine, 11, 1504532. https://doi.org/10.3389/fmed.2024.1504532
Méndez-Mantuano, M. O., Morán, M. Y. O., Mayorga, I. I. C., Valdez, A. Y. L., Rosado, Á. R. H., & Robles, D. V. A. (2024). La evaluación académica en la era de la inteligencia artificial (IA). South Florida Journal of Development, 5(1), 119–148. https://doi.org/10.46932/sfjdv5n1-010
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). American Council on Education & Macmillan.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749. https://doi.org/10.1037/0003-066X.50.9.741
Mora, A. (2004). La evaluación educativa: Concepto, períodos y modelos. Actualidades Investigativas en Educación, 4(2). https://www.redalyc.org/pdf/447/44740211.pdf
Pizano, G. (2014). Modelos de evaluación curricular. Revista Investigación Educativa, 4(6), 15–22. https://revistasinvestigacion.unmsm.edu.pe/index.php/educa/article/view/7640/6649
Ramos-Armijos, D. F., Ramos-Armijos, D. G., Tapia-Puga, V. M., & Tapia-Puga, L. I. (2023). Explorando las fronteras: La aplicación de inteligencia artificial en la evaluación educativa. Ciencia Latina Revista Científica Multidisciplinar, 7(6), 5657–5667. https://doi.org/10.37811/cl_rcm.v7i6.9108
Reeves, T. C. (2006). Design research from a technology perspective. In J. van den Akker, K. Gravemeijer, S. McKenney, & N. Nieveen (Eds.), An introduction to educational design research (pp. 52–66). Netherlands Institute for Curriculum Development (SLO).
Schulhoff, S., Ilie, M., Balepur, N., et al. (2024). The prompt report: A systematic survey of prompting techniques. University of Maryland. https://arxiv.org/abs/2406.06608
Stufflebeam, D. L., & Shinkfield, A. J. (2007). Evaluation theory, models, and applications. Jossey-Bass.
Taureaux Díaz, N., Miralles Aguilera, E., Vicedo Tomey, A., & Díaz-Perera, G. (2016). Instrumento para la evaluación curricular de la función de administración en la carrera de Medicina. Revista Habanera de Ciencias Médicas, 15(5), 769–781.
Velásquez-Henao, J. D., Franco-Cardona, C. J., & Cadavid-Higuita, L. (2023). Prompt engineering: A methodology for optimizing interactions with AI-language models in the field of engineering. Revista DYNA, 90(230), 9–17. https://doi.org/10.15446/dyna.v90n230.111700
White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., & Schmidt, D. C. (2023). A prompt pattern catalog to enhance prompt engineering with ChatGPT. Vanderbilt University. https://arxiv.org/abs/2302.11382
License
Copyright (c) 2026. Authors who publish in Revista La Universidad agree to the following terms: authors retain ownership of their works, granting the journal non-exclusive dissemination rights under the Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA 4.0). This license permits using a work to create another work or content, with or without modifying the original, provided the author is cited, the resulting work is shared under the same type of license, and it has no commercial purpose (https://creativecommons.org/licenses/by-nc-sa/4.0/deed.es).

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

