Keyword

Large Language Models

Explore 2 research publications tagged with this keyword

2 Publications

6 Authors

1 Year

Publications Tagged with "Large Language Models"

2026

2 publications

Energy-Efficient Training of Large Language Models Through Sparse Attention and Low-Rank Adaptation (LoRA-S)

Anjani Kumar Tiwari and Pawar Harish

5/19/2026

pp. 19-42

January-June 2026 (Vol. 2, Issue 1, 2026)

Abstract: Large language models (LLMs) such as GPT and BERT have revolutionized natural language processing but impose enormous computational and energy costs due to their massive parameter sizes and quadratic attention complexity. This study introduces LoRA-S, a unified framework that combines Low-Rank Adaptation (LoRA) with sparse attention mechanisms to achieve energy-efficient and scalable training of transformer-based LLMs. By freezing pretrained weights and injecting low-rank trainable matrices into attention and feed-forward layers, LoRA reduces the number of trainable parameters by over 90%, significantly lowering gradient computation and memory overhead. Simultaneously, sparse attention restricts token interactions to structured subsets, cutting attention-related FLOPs from 100 G to 7 G in WikiText-2 experiments. Comparative analysis across Full Fine-Tuning, LoRA, and LoRA-S demonstrates that LoRA-S achieves the lowest energy consumption of 22,380 J (6.22 Wh) while maintaining competitive task performance, with perplexity of 115.26 on WikiText-2 and sentiment classification accuracy of 73.90% on IMDB. Pareto frontier analysis confirms LoRA-S as an optimal trade-off between computational efficiency and predictive capability, enabling resource-constrained and eco-friendly model deployment. These results establish LoRA-S as a practical step toward Green AI, providing a novel, integrated approach to minimize FLOPs and parameter updates without substantially compromising LLM performance.

Large Language Models, Low-Rank Adaptation (LoRA), Sparse Attention, Energy-Efficient Training, Green
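The arithmetic behind the abstract's efficiency claims can be checked with a short sketch. The layer size (768 x 768) and rank (8) below are hypothetical values chosen only for illustration and are not taken from the paper; the energy and FLOP figures are the ones quoted in the abstract. The parameter count uses the standard LoRA factorization, in which a frozen weight matrix is augmented by two trainable low-rank factors.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    The frozen weight W is left untouched; only the low-rank factors
    A (rank x d_in) and B (d_out x rank) are trained.
    """
    return rank * (d_in + d_out)


def joules_to_watt_hours(joules: float) -> float:
    """Convert energy from joules to watt-hours (1 Wh = 3600 J)."""
    return joules / 3600.0


def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


# Hypothetical layer size and rank, chosen only for illustration.
d_in, d_out, rank = 768, 768, 8
full = d_in * d_out  # weights updated by full fine-tuning of this layer
lora = lora_trainable_params(d_in, d_out, rank)
param_saving = relative_reduction(full, lora)

# Figures quoted in the abstract.
energy_wh = joules_to_watt_hours(22_380)
flop_saving = relative_reduction(100e9, 7e9)

print(f"LoRA trains {lora} of {full} weights ({param_saving:.1%} fewer)")
print(f"22,380 J = {energy_wh:.2f} Wh")
print(f"Attention FLOPs reduced by {flop_saving:.0%}")
```

Under these assumed dimensions the per-layer saving is consistent with the "over 90%" figure, and the joule-to-watt-hour conversion matches the 6.22 Wh stated in the abstract.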

HALLUCINATION IN LARGE LANGUAGE MODELS: CHARACTERIZATION, DETECTION, AND MITIGATION APPROACHES

Meenal Vardar et al.

3/3/2026

July-Dec 2025 (Vol. 1, Issue 1, 2025)

Abstract: Hallucination in large language models is a significant barrier to factual accuracy and dependability in AI-generated outputs. Using a benchmark Kaggle dataset, this work evaluates both transformer-based architectures and traditional machine learning classifiers for hallucination detection. Fine-tuned transformer models (DistilBERT, RoBERTa, and DeBERTa) are compared with baseline models (Random Forest, SVM, and Logistic Regression). The results show that the transformer-based models were more robust and better at understanding context, although conventional models such as Random Forest still achieved a high overall accuracy of 94.10%. DistilBERT struck a good balance between precision and readability. Confusion matrix analysis showed that the models reduced false alarms on non-hallucinated outputs. ROC-AUC scores confirmed the transformers’ precision and their ability to identify subtle semantic discrepancies. Other studies provide supporting evidence that deeper context modeling improves detection reliability, as shown by reduced hallucinations and by assessments of error frequency. In conclusion, this research shows that combining traditional and modern approaches is beneficial and that fine-tuning transformer models holds promise for reducing hallucinations. It represents an early step toward more trustworthy, human-like AI models.

Hallucination Detection, Large Language Models, Transformer-Based Models, Machine Learning, Trustworthy AI
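The two evaluation measures named in the abstract, false alarms on non-hallucinated outputs and ROC-AUC, can be illustrated with a minimal sketch. The labels, scores, and 0.5 threshold below are hypothetical toy values, not data or settings from the paper, and the functions are a plain-Python illustration rather than the authors' evaluation code. Here label 1 means "hallucinated" and label 0 means "not hallucinated".

```python
def false_alarm_rate(y_true, y_pred):
    """Share of non-hallucinated outputs (label 0) wrongly flagged as hallucinated (1)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a positive outranks a negative.

    Every (positive, negative) pair is compared; ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and detector scores for six outputs.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

# Assumed decision threshold of 0.5 for turning scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

far = false_alarm_rate(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"False alarm rate: {far:.3f}")
print(f"ROC-AUC: {auc:.3f}")
```

The false alarm rate depends on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.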
