This paper is available on arxiv under CC 4.0 license.
Authors:
(1) Ajay Krishnan T. K., School of Digital Sciences;
(2) V. S. Anoop, School of Digital Sciences.
2 Related Studies
This section provides an overview of recent and influential research in machine learning (ML) and natural language processing (NLP), with a focus on climate change analysis. It also discusses studies of sentiment-based text classification that are pertinent to the proposed work. The reviewed studies demonstrate the effectiveness of sentiment analysis and named entity recognition in the context of climate change analysis, and models such as BERT, together with attention mechanisms, have shown promising results in capturing contextual information and improving performance. These studies provide insights and methodologies that guide our approach to sentiment analysis and named entity recognition on climate change-related tweets and texts using the pre-trained ClimateBERT model.
One study assessed the effectiveness of ML algorithms in predicting long-term global warming. It examined algorithms such as LR, SVR, lasso, and ElasticNet to relate average annual temperature to greenhouse gas factors. Analyzing a dataset spanning 100-150 years, the study found that carbon dioxide (CO2) had the most significant impact on temperature changes, followed by CH4, N2O, and SF6. Using this information, the researchers forecast temperature trends and greenhouse gas levels for the next decade, providing insights for mitigating the consequences of global warming.
Another study analyzes public sentiment regarding climate change using Twitter data. It aims to tackle the polarization and misinformation that often arise in climate change discussions on social media platforms. To this end, the researchers introduce a multitask model named MEMOCLiC, which combines stance detection with additional tasks such as emotion recognition and offensive language identification. By employing various embedding techniques and attention mechanisms, the framework captures the characteristics specific to each modality and the interactions between modalities. Experimental findings show that MEMOCLiC improves stance detection accuracy over baseline methods.
Another paper examines polarization and belief systems in climate change discussions on Twitter. It proposes a framework that identifies statements denying climate change and classifies tweets as expressing either a denier or a believer stance. [Sham and Mohamed, 2022] The framework focuses on two interconnected tasks: stance detection and sentiment analysis. Combining these tasks, the multi-task model uses feature-specific and shared-specific attention frameworks to learn comprehensive features. Experimental results show that the framework improves stance detection accuracy by leveraging sentiment analysis, outperforming uni-modal and single-task approaches.
A further study combines the BERT model with a convolutional neural network (CNN) [Lydiri et al., 2022]. It analyzes public opinion on climate change using Twitter data. The results indicate that the proposed model surpasses conventional machine learning methods and accurately identifies climate change believers and deniers. The authors suggest that the model has significant potential for monitoring and governance, particularly in smart city contexts. Future work includes investigating alternative deep learning algorithms and extending the analysis to other social media platforms.
This research paper [Ceylan, 2022] investigates the application of AI and NLP models to the analysis of extensive unstructured data concerning climate change. The study primarily aims to develop an information management system capable of extracting pertinent information from diverse data sources, particularly technical design documentation. By using AI-based NLP models pre-trained on textual data and integrating non-textual graphical data, the researchers show that the system retrieves precise information quickly and efficiently. The ultimate objective is to promote knowledge democratization and make information accessible to a broad user base.
Another paper examines people's emotions and opinions concerning the conflict between Russia and Ukraine using ML and DL techniques. [Sirisha and Bolem, 2022] The study introduces a novel hybrid model combining sequence and transformer models, namely RoBERTa, ABSA, and LSTM. For the analysis, a large dataset of geographically tagged tweets related to the Ukraine-Russia war is collected from Twitter, and sentiment analysis is performed using the proposed model. The findings indicate that the hybrid model achieves an accuracy of 94.7%, surpassing existing approaches to sentiment analysis. The study underscores the significance of social media platforms such as Twitter for gaining insight into public sentiment and opinions regarding global events.
This research paper aims to overcome the limitations of general language models in representing climate-related texts. The authors introduce CLIMATEBERT, a transformer-based language model that is further pretrained on a large corpus of climate-related paragraphs extracted from diverse sources such as news articles, research papers, and corporate disclosures. [Webersinke et al., 2021] Comparative evaluations show that CLIMATEBERT surpasses commonly used language models, with a 48% improvement on a masked language model objective. This improvement in turn contributes to lower error rates in various climate-related downstream tasks. To encourage further research at the intersection of climate change and natural language processing, the authors provide public access to the training code and weights of CLIMATEBERT.
Another paper uses ML algorithms to analyze and predict climate change. The authors emphasize the significance of understanding and adapting to the impacts of climate change on both human society and the environment. The study discusses the application of ML methods to historical temperature data and carbon dioxide concentrations dating back to the 18th century, and it highlights the potential advantages of machine learning and artificial intelligence in interpreting and exploiting climate data for simulations and predictions. Multiple machine learning algorithms, such as DT, RF, and ANN, are examined for climate change risk assessment and prediction. The authors conclude that integrating machine learning techniques can enhance climate modeling, enabling informed decision-making on climate change mitigation and adaptation strategies.
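As a rough illustration of the masked language model objective on which CLIMATEBERT is evaluated, the sketch below hides a random subset of tokens and records the originals that a model would be trained to predict. It is not the authors' code. The function name `mask_tokens`, the whitespace tokenization, the example sentence, and the masking rates are all assumptions made for illustration, and the scheme is simplified: BERT-style pretraining typically selects about 15% of tokens and replaces only most of them with the mask symbol.

```python
import random

MASK_TOKEN = "[MASK]"  # placeholder symbol; BERT-style vocabularies use a token of this form


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Replace a random subset of tokens with MASK_TOKEN.

    Returns the masked sequence and a dict mapping each masked position
    to the original token that a model would be asked to predict.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            targets[position] = token
        else:
            masked.append(token)
    return masked, targets


# Whitespace tokenization and this sentence are assumptions made for brevity.
sentence = "rising sea levels threaten coastal infrastructure and freshwater supplies"
tokens = sentence.split()
masked, targets = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(" ".join(masked))
print(targets)
```

A model scores well on this objective when it assigns high probability to the hidden tokens, which is why pretraining on in-domain climate text can be expected to help on climate-specific vocabulary.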
This research paper introduces the ClimaText dataset [Varini et al., 2020], developed specifically for detecting climate change topics at the sentence level in textual sources. The authors emphasize the significance of automating the extraction of climate change information from media and other text-based materials to support applications such as content filtering, sentiment analysis, and fact-checking. Comparing different approaches to identifying climate change topics, they find that context-based algorithms like BERT outperform simple keyword-based models. However, the authors also identify areas that require improvement, particularly in capturing discussion of the indirect effects of climate change. They anticipate that the dataset will be a valuable resource for further research in natural language understanding and climate change communication.
Another study [Upadhyaya et al., 2022] underscores the importance of understanding public perception and acceptance of climate change policies. It examines diverse data sources, such as social media, scientific papers, and news articles, to perform sentiment analysis. ML techniques, specifically SVM, are evaluated for extracting insights from these data sources. The paper concludes that supervised machine learning techniques are effective for sentiment analysis and that ensemble and hybrid approaches yield better outcomes than individual classifiers.
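The kind of simple keyword-based model that the ClimaText comparison refers to can be sketched in a few lines. This is a minimal illustration, not the baseline used by the dataset's authors. The keyword list, the function name `is_climate_sentence`, and the example sentences are hypothetical.

```python
import re

# Hypothetical keyword list, chosen for illustration only; it is not taken from ClimaText.
CLIMATE_KEYWORDS = ("climate change", "global warming", "greenhouse gas", "carbon emissions")


def is_climate_sentence(sentence: str) -> bool:
    """Return True if the sentence contains any keyword phrase (case-insensitive)."""
    # Lowercase and collapse whitespace so multi-word phrases match reliably.
    normalized = re.sub(r"\s+", " ", sentence.lower())
    return any(keyword in normalized for keyword in CLIMATE_KEYWORDS)


# Made-up example sentences.
direct = "Global warming is accelerating faster than predicted."
indirect = "Coastal towns face higher insurance premiums as floods become more frequent."
unrelated = "The committee approved the new budget on Tuesday."

print(is_climate_sentence(direct))     # True
print(is_climate_sentence(indirect))   # False
print(is_climate_sentence(unrelated))  # False
```

The second sentence plausibly concerns an indirect effect of climate change but contains none of the listed phrases, so the keyword matcher rejects it. This is the kind of case where a context-based model would be expected to have an advantage.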