INTEGRATING MOBILOGRAPHY AND CAMERA MOVEMENTS INTO NEURAL PHOTO SHOOTS: IMPROVING VISUAL REALISM THROUGH PROMPT ENGINEERING
Published 2026-05-22
Keywords
- neural photo shoot
- mobilography
- prompt engineering
- camera angles
- camera movements
- diffusion models
- visual realism
Abstract
This article analyzes how mobilography, camera angles, and camera movements can be integrated into artificial intelligence (AI) systems through prompt engineering during a neural photo shoot. Although modern text-to-image diffusion models offer strong capabilities for creating visual content, their output quality depends on how precise and well structured the input prompt is. In the article, camera angles (eye-level, low angle, high angle, and others) and camera movements (pan, tilt, dolly, handheld) are formalized, together with mobilographic knowledge, as a structured prompt. The results show that this approach noticeably improves the realism, depth, and cinematic quality of the generated image.
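As an illustration of what a structured prompt of this kind might look like, the following is a minimal Python sketch. It is not the authors' implementation: the `ShotPrompt` class, its field names, the phrase templates, and the default device cue are all hypothetical, and only the angle and movement vocabularies are taken from the abstract.

```python
from dataclasses import dataclass

# Vocabularies named in the abstract. The abstract also mentions
# "other" angles that it does not list, so this set is incomplete.
CAMERA_ANGLES = {"eye-level", "low angle", "high angle"}
CAMERA_MOVEMENTS = {"pan", "tilt", "dolly", "handheld"}


@dataclass(frozen=True)
class ShotPrompt:
    """Hypothetical container for one structured text-to-image prompt."""

    subject: str
    angle: str
    movement: str
    device: str = "smartphone camera"  # assumed mobilography cue

    def render(self) -> str:
        """Validate the fields and join them into one prompt string."""
        if self.angle not in CAMERA_ANGLES:
            raise ValueError(f"unknown camera angle: {self.angle!r}")
        if self.movement not in CAMERA_MOVEMENTS:
            raise ValueError(f"unknown camera movement: {self.movement!r}")
        parts = [
            self.subject,
            f"{self.angle} shot",
            f"{self.movement} camera movement",
            f"shot on a {self.device}",
        ]
        return ", ".join(parts)


prompt = ShotPrompt(
    subject="portrait of a street musician",
    angle="low angle",
    movement="handheld",
).render()
print(prompt)
```

Keeping each camera attribute in its own validated field means a misspelled or unsupported term fails early instead of silently producing a vaguer prompt. The resulting string could then be passed to any text-to-image model as its prompt.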
