By: Falk Huettmann, Institute of Arctic Biology, University of Alaska Fairbanks

Introduction

For the Arctic and the polar regions, accurate data mining and forecasting are a basis for good decision-making. For instance, predicting changes in carbon dioxide emission levels, and how those changes will affect climate, sea level, and biodiversity scenarios as well as social and human dimensions, will help inform sound decisions for future generations. This is no easy feat given how deeply connected the polar regions are: Alaska and the Arctic are not isolated, stand-alone entities but are deeply connected across ecological, polar, tropical, oceanic, atmospheric, and universal contexts (e.g., Raya Rey and Huettmann 2019). Artificial Intelligence (AI) and Machine Learning (ML) offer tools and new approaches for obtaining powerful, best-possible predictions to support inference and policy, both for the present and for the future (Murphy et al. 2010, 2012; Aycrigg et al. 2015; Baltensperger and Huettmann 2015; Morton and Huettmann 2018).

Artificial Intelligence (AI) and Machine Learning (ML) Defined

AI is a wide and evolving field that combines computer science and datasets, whether robust or new, to enable problem-solving. It encompasses the established sub-fields of ML and deep learning, although the divisions among them are not easily defined and overlap somewhat (Humphries et al. 2018). These disciplines comprise over 100 algorithms (Fernandez-Delgado et al. 2014) and their source-code implementations, which allow a computer to identify patterns and signals in data and to build testable, high-quality predictions from them.

In a nutshell, machine learning is the concept that a computer program can learn from and adapt to new data without human intervention. It is useful for parsing, largely automatically and with little error, the immense amount of information that is continuously and readily available, and so supports up-to-date decision-making. Deep learning is a subset of machine learning, usually a neural network with three or more layers, that can pick up fine detail in an otherwise vast dataset. These neural networks attempt to simulate the behavior of the human brain, which allows them to "learn" relevant patterns from large and complex amounts of data.

"Big Data" refers to the underlying large, diverse sets of information that grow at ever-increasing rates, e.g., due to automated field sensors. The term encompasses the volume of information, the velocity or speed at which it is created and collected, and the variety or scope of the data points being covered (for further information, see "What is artificial intelligence?"). AI/ML can also handle many datasets and databases that are otherwise left unused or under-analyzed, often because they are thought of as too small, too complex, or incomplete. Extracting information from such data is part of AI/ML and has been done successfully (e.g., Jiao et al. 2016).
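To make the idea of a layered neural network concrete, the following is a minimal sketch of a forward pass through three layers (input, hidden, output). The weights, the two input features, and all function names are hypothetical and hand-picked for illustration; a real network would learn its weights from data and would normally be built with a dedicated library.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per unit, then a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


# Hypothetical, hand-picked weights (assumption: not taken from any real model).
HIDDEN_W = [[0.8, -0.4], [-0.3, 0.9], [0.5, 0.5]]  # 3 hidden units, 2 inputs each
HIDDEN_B = [0.0, 0.1, -0.2]
OUTPUT_W = [[1.2, -0.7, 0.4]]                       # 1 output unit, 3 inputs
OUTPUT_B = [0.05]


def forward(features):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense(features, HIDDEN_W, HIDDEN_B)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


# Two made-up, rescaled predictors (e.g., temperature and precipitation).
score = forward([0.6, 0.2])
print(round(score, 3))
```

Training would consist of adjusting the weights so that the output score matches known examples; that step is omitted here to keep the sketch short.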

Potential Applications of Artificial Intelligence and Machine Learning with an Arctic Focus

When modeling expertise is linked with agencies and data centers (see Table 1) in a team of subject experts, AI and ML offer strong applications to Arctic science problems. These applications are diverse and include, but are not limited to, remote sensing (Robold and Huettmann 2021), biomass and biodiversity (Young et al. 2017, 2018), and sound and wilderness (Mullet et al. 2016).

Table 1. Applications and References for Typical AI/ML Projects with an Arctic Focus.

Application | Citations | Comment
Time series and habitat data for many locations worldwide | Oppel and Huettmann 2010; Dornelas et al. 2018; Huettmann et al. 2018; Solovyeva et al. 2021; Global Biodiversity Information Facility (GBIF) | Massive datasets (“Big Data”) already exist, ready to be used for species, habitats, and models alike.
Climate-based forecasting | Baltensperger and Huettmann 2015; Huettmann 2007; Morton and Huettmann 2018 | One of the major approaches to learning about future scenarios, usually based on the approaches mentioned below, e.g., Species Distribution Models.
Use of climate data for spatial predictions (current time) | Young et al. 2017, 2018; Zhang et al. 2022 | Basic applications include Species Distribution Models (SDMs), Resource Selection Functions (RSFs), Habitat Suitability Index (HSI) models, and their derivatives.
Data mining of Big Data | Miller et al. 2014; Jiao et al. 2016; Huettmann 2018; Lata et al. 2022 | A major method for obtaining new and additional information from a dataset, focusing on finding robust signals and patterns.
Generalization from complex and incomplete datasets | Huettmann et al. 2011; Humphries and Huettmann 2014; Jiao et al. 2016 | ML and AI can be used for small, tiny, marginal, and incomplete datasets.
Multispecies applications (marine) | Wei et al. 2011; Miller et al. 2014 | Hundreds of species can be modeled and used for inference; major progress in handling the environment.
Multispecies applications (terrestrial) | Aycrigg et al. 2015; Zabihi et al. 2021 | See above.
Ensemble models | Hardy et al. 2011; Zabihi et al. 2021 | A combination of models, the model ensemble, is developing into a global standard for inference and generalization. For instance, the RandomForest algorithm (bootstrap aggregating) can already be seen as an ensemble model by itself.
Synergy effects across disciplines | Raya Rey and Huettmann 2019 | For complex ecology, this approach offers itself as a major platform for inference and strategy.
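As a rough illustration of the ensemble idea in Table 1, the sketch below applies bootstrap aggregating ("bagging") to very simple one-split classifiers ("stumps"). It is not the RandomForest algorithm itself, which also randomizes the predictors and grows full trees, and it is not the method of any cited study. The presence/absence data, the single temperature predictor, and all function names are hypothetical.

```python
import random

# Hypothetical records: (mean July temperature in deg C, species present = 1 / absent = 0).
DATA = [(2.0, 0), (3.5, 0), (4.0, 0), (5.5, 0), (6.0, 1),
        (7.5, 0), (8.0, 1), (9.5, 1), (11.0, 1), (12.5, 1)]


def predict_stump(stump, x):
    """A stump predicts presence when x lies on the chosen side of its threshold."""
    threshold, sign = stump
    return 1 if sign * (x - threshold) >= 0 else 0


def fit_stump(rows):
    """Try every observed value as a threshold, in both directions; keep the best."""
    best_errors, best_stump = None, None
    for threshold, _ in rows:
        for sign in (1, -1):
            stump = (threshold, sign)
            errors = sum(predict_stump(stump, x) != y for x, y in rows)
            if best_errors is None or errors < best_errors:
                best_errors, best_stump = errors, stump
    return best_stump


def fit_bagged(rows, n_models=25, seed=0):
    """Fit one stump per bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models


def predict_bagged(models, x):
    """Ensemble output: the fraction of stumps that vote 'present'."""
    return sum(predict_stump(m, x) for m in models) / len(models)


models = fit_bagged(DATA)
for temperature in (1.0, 7.0, 13.0):
    print(temperature, predict_bagged(models, temperature))
```

Because each stump sees a slightly different resample of the data, the averaged vote behaves like a smooth suitability index between 0 and 1 rather than a single hard cut-off, which is one reason ensembles tend to generalize better than any one of their members.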

Machine learning and artificial intelligence applications follow a series of steps, shown in Figure 1. The figure depicts a commonly found workflow and cycle of inquiry that includes finding data, assessing data, modeling data, predicting and applying data, and documenting all data, models, and the overall process, with all steps done in the cloud (Huettmann 2007; Huettmann et al. 2017; Humphries and Huettmann 2018; Gulyaeva et al. 2020).
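The workflow steps named above (find, assess, model, predict, document) can be sketched in a few lines of code. This is only an illustration under stated assumptions: the records, the site names, and the predictors are invented, and the "model" is a simple min/max climate envelope used as a stand-in for the ML algorithms discussed in this article.

```python
import csv
import io
import json

# Hypothetical occurrence records, invented for illustration only.
RAW_CSV = """site,july_temp_c,precip_mm,present
A,6.1,310,1
B,7.4,280,1
C,,295,1
D,2.0,150,0
E,8.9,340,1
F,12.5,500,0
"""

PREDICTORS = ["july_temp_c", "precip_mm"]


def find_data(text):
    """Step 1: load the raw records (here from an inline CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def assess_data(rows):
    """Step 2: keep only complete records and convert values to numbers."""
    clean = []
    for row in rows:
        if all(row[name] != "" for name in PREDICTORS + ["present"]):
            record = {name: float(row[name]) for name in PREDICTORS}
            record["present"] = int(row["present"])
            clean.append(record)
    return clean


def fit_envelope(records):
    """Step 3: store the min/max of each predictor where the species is present."""
    presences = [r for r in records if r["present"] == 1]
    return {
        name: (min(r[name] for r in presences), max(r[name] for r in presences))
        for name in PREDICTORS
    }


def predict(envelope, site):
    """Step 4: a site is suitable if every predictor lies inside the envelope."""
    return all(low <= site[name] <= high for name, (low, high) in envelope.items())


def document(raw, clean, envelope):
    """Step 5: write down what was used, so that the run can be repeated."""
    return json.dumps(
        {"records_found": len(raw), "records_used": len(clean),
         "predictors": PREDICTORS, "envelope": envelope},
        indent=2,
    )


raw = find_data(RAW_CSV)
clean = assess_data(raw)
envelope = fit_envelope(clean)
print(predict(envelope, {"july_temp_c": 7.0, "precip_mm": 300.0}))
print(document(raw, clean, envelope))
```

In a real project each step would be far richer (public data portals, quality checks, an ML algorithm, mapped predictions, formal metadata), but the order of the steps and the habit of documenting every run stay the same.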


Figure 1. A commonly found workflow and cycle of inquiry used for inference in machine learning and artificial intelligence projects (e.g., Bluhm et al. 2010; Gulyaeva et al. 2020; Huettmann 2020). Figure by F. Huettmann.

Conclusion

The applications for AI/ML are much wider than just climate and impact forecasting. They are interdisciplinary and holistic, informing society, using and feeding public libraries, and affecting global sustainability. They also readily extend to disease and pandemic topics. Good AI/ML data analysts develop repeatable, transparent, and globally applicable workflows for AI/ML models, data, and their applications so that events can be forecast in a strategic fashion (Bluhm et al. 2010; Huettmann 2020). It is less the model or the algorithm as such, and more the wider approach to the inherently linked Arctic problem at hand, where AI/ML can excel: it can assist human thinking and help avoid human errors in an automated fashion, even online and globally accessible.


References

Aycrigg, J., G. Beauvais, T. Gotthardt, F. Huettmann, S. Pyare, M. Andersen, D. Keinath, J. Lonneker, M. Spathelf, and K. Walton. 2015. Novel Approaches to Modeling and Mapping Terrestrial Vertebrate Occurrence in the Northwest and Alaska: An Evaluation. Northwest Science 89: 355–381. doi: 10.3955/046.089.0405

Baltensperger, A. P., and F. Huettmann. 2015. Predicted Shifts in Small Mammal Distributions and Biodiversity in the Altered Future Environment of Alaska: An Open Access Data and Machine Learning Perspective. PLOS ONE. doi: 10.1371/journal.pone.0132054

Bluhm, B., D. Watts, and F. Huettmann. 2010. Free Database Availability, Metadata and the Internet: An Example of Two High Latitude Components of the Census of Marine Life. Chapter 13. In: S. Cushman and F. Huettmann. Spatial Complexity, Informatics and Wildlife Conservation. Springer Tokyo, Japan. pp. 233–244.

Dornelas, M., L. H. Antão, F. Moyes, et al. 2018. BioTIME: A Database of Biodiversity Time Series for the Anthropocene. Global Ecology and Biogeography 00: 1–26. https://doi.org/10.1111/geb.12729

Fernandez-Delgado, M., E. Cernadas, S. Barro, and D. Amorim. 2014. Do we Need Hundreds of Classifiers to Solve Real World Classification Problems? Journal of Machine Learning Research 15: 3133–3181.

Gulyaeva, M., F. Huettmann, A. Shestopalov, M. Okamatsu, K. Matsuno, D.-H. Chu, Y. Sakoda, A. Glushchenko, E. Milton, and E. Bortz. 2020. Data Mining and Model-Predicting a Global Disease Reservoir for Low-Pathogenic Avian Influenza (AI) in the Wider Pacific Rim Using Big Data Sets. Scientific Reports 10: 1681. https://doi.org/10.1038/s41598-020-73664-2

Hardy, S. M., M. Lindgren, H. Konakanchi and F. Huettmann. 2011. Predicting the Distribution and Ecological Niche of Unexploited Snow Crab (Chionoecetes opilio) Populations in Alaskan Waters: A First Open-Access Ensemble Model. Integrative and Comparative Biology 51(4): 608–622; doi: 10.1093/icb/icr102

Huettmann, F. 2007. The Digital Teaching Legacy of the International Polar Year (IPY): Details of a Present to the Global Village for Achieving Sustainability. Eds. M. Tjoa and R. R. Wagner. Proceedings 18th International Workshop on Database and Expert Systems Applications (DEXA), 3–7 September 2007, Regensburg, Germany. IEEE Computer Society, Los Alamitos, CA. pp. 673–677.

Huettmann, F. 2018. Advanced Data Mining (Cloning) of Predicted Climate-Scapes and their Variances Assessed with Machine Learning: An Example from Southern Alaska Shows Topographical Biases and Strong Differences. In: G. Humphries, D. R. Magness, and F. Huettmann. Machine Learning for Ecology and Sustainable Natural Resource Management. pp. 227–241.

Huettmann, F. 2020. Chapter 24 - Investigating Matschie's Tree Kangaroos With 'Modern' Methods: Digital Workflows, Big Data Project Infrastructure, and Mandated Approaches for a Holistic Conservation Governance. In: L. Dabek, P. Valentine, J. Blessington, and K. Schwartz (Eds.). Tree Kangaroos: Science and Conservation. Academic Press. pp. 379–391.

Huettmann, F., E. E. Magnuson, and K. Hueffer. 2017. Ecological Niche Modeling of Rabies in the Changing Arctic of Alaska. Acta Veterinaria Scandinavica 59: 18–31. doi: 10.1186/s13028-017-0285-0

Huettmann, F., C. Mi, and Yu Guo. 2018. "Batteries" in Machine Learning: A First Experimental Assessment of Inference for Siberian Crane Breeding Grounds in the Russian High Arctic Based on "Shaving" 74 Predictors. In: G. Humphries, D. R. Magness, and F. Huettmann. Machine Learning for Ecology and Sustainable Natural Resource Management. pp. 163–184.

Huettmann, F., Yu Artukhin, O. Gilg, and G. Humphries. 2011. Predictions of 27 Arctic Pelagic Seabird Distributions Using Public Environmental Variables, Assessed with Colony Data: A First Digital IPY and GBIF Open Access Synthesis Platform. Marine Biodiversity 41: 141–179. DOI 10.1007/s12526-011-0083-2

Humphries, G. R. W. and F. Huettmann. 2014. Putting Models to a Good Use: A Rapid Assessment of Arctic Seabird Biodiversity Indicates Potential Conflicts with Shipping Lanes and Human Activity. Diversity and Distributions 1-13.

Humphries, G., D. R. Magness, and F. Huettmann. 2018. Machine Learning for Ecology and Sustainable Natural Resource Management. Springer Gland, Switzerland.

Humphries, G. R. W. and F. Huettmann. 2018. Machine Learning and 'The Cloud' for Natural Resource Applications: Autonomous Online Robots Driving Sustainable Conservation Management Worldwide? In: G. Humphries, D. R. Magness, and F. Huettmann. Machine Learning for Ecology and Sustainable Natural Resource Management. pp. 353–377.

Jiao, S., F. Huettmann, Y. Guo, X. Li, and Y. Ouyan. 2016. Advanced Long-Term Bird Banding and Climate Data Mining in Spring Confirm Passerine Population Declines for the Northeast Chinese-Russian Flyway. Global and Planetary Change 144 C: 17–33 DOI 10.1016/j.gloplacha.2016.06.015

Lata, T. D., P. A. Deymier, K. Runge, R. Ferrière, and F. Huettmann. 2022. Topological Acoustic Sensing of Ground Stiffness: Presenting a Potential Means of Sensing Warming Permafrost in a Forest. Cold Regions Science and Technology 103569. ISSN 0165-232X. https://doi.org/10.1016/j.coldregions.2022.103569

Miller, K., F. Huettmann, B. Norcross, and M. Lorenz. 2014. Multivariate Random Forest Models of Estuarine-Associated Fish and Invertebrate Communities. Marine Ecology Progress Series 500: 159–174.

Morton, J. M., and F. Huettmann. 2018. Moose, Caribou and Sitka Black-Tailed Deer. Chapter 7 in G. D. Hayward, S. Colt, M. McTeague, and T. Hollingsworth (eds.). Climate Change Vulnerability Assessment for the Chugach National Forest and the Kenai Peninsula. General Technical Report PNW-GTR-000. USDA Forest Service, Pacific Northwest Research Station. Portland, Oregon.

Mullet, T. C., J. M. Morton, S. H. Gage, and F. Huettmann. 2016. Acoustic Footprint of Snowmobile Noise and Natural Quiet Refugia in an Alaskan Wilderness. Natural Areas Journal 37: 332–349.

Murphy, K., F. Huettmann, N. Fresco, and J. Morton. 2010. Connecting Alaska Landscapes into the Future. U.S. Fish and Wildlife Service, and the University of Alaska. https://uaf-snap.org/wp-content/uploads/2020/06/SNAP_connectivity_2010…

Murphy, K., J. Reynolds, J. Jenkins, E. Whitten, N. Fresco, M. Lindgren, and F. Huettmann. 2012. Predicting Future Potential Climate-Biomes for the Yukon, Northwest Territories, and Alaska: A Climate-Linked Cluster Analysis Approach to Analyzing Possible Ecological Refugia and Areas of Greatest Change. Prepared by the Scenarios Network for Arctic Planning (SNAP) and the EWHALE lab, University of Alaska Fairbanks on behalf of The Nature Conservancy Canada, Government Northwest Territories. https://uaf-snap.org/wp-content/uploads/2020/06/Cliomes-FINAL.pdf

Oppel, S. and F. Huettmann. 2010. Using a Random Forest Model and Public Data to Predict the Distribution of Prey for Marine Wildlife Management. Chapter 8. In: S. Cushman and F. Huettmann. Spatial Complexity, Informatics and Wildlife Conservation. Springer Tokyo, Japan. pp. 151–164.

Raya Rey, A. and F. Huettmann. 2019. Telecoupling Analysis of the Patagonian Shelf: A New Approach to Study Global Seabird-Fisheries Interactions to Achieve Sustainability. Journal for Nature Conservation 3: https://www.sciencedirect.com/science/article/pii/S1617138118301067

Robold, R. and F. Huettmann. 2021. High-Resolution Prediction of American Red Squirrel in Interior Alaska: A Role Model for Conservation Using Open Access Data, Machine Learning, GIS and LIDAR. PeerJ. https://peerj.com/articles/11830

Solovyeva, D., I. Bysykatova-Harmey, S. L. Vartanyan, A. Kondratyev, and F. Huettmann. 2021. Modeling Eastern Russian High Arctic Geese (Anser fabalis, A. albifrons) During Moult and Brood Rearing in the New Digital Arctic. Scientific Reports. https://www.nature.com/articles/s41598-021-01595-7

Wei, C. et al. (15 co-authors). 2011. A Global Analysis of Marine Benthos Biomass Using RandomForest. PLOS ONE 5: e15323.

Young, B., J. Yarie, D. Verbyla, F. Huettmann, K. Herrick, and F. S. Chapin. 2017. Modeling and Mapping Forest Diversity within the Boreal Forest of Interior Alaska. Landscape Ecology 32: 397–413.

Young, B. D., J. Yarie, D. Verbyla, F. Huettmann, and F. Stuart Chapin III. 2018. Mapping Aboveground Biomass of Trees Using Forest Inventory Data and Public Environmental Variables within the Alaskan Boreal Forest. In: G. Humphries, D. R. Magness and F. Huettmann. Machine Learning for Ecology and Sustainable Natural Resource Management. pp 141–160.

Zabihi, K., F. Huettmann, B. Young. 2021. Predicting Multi-Species Bark Beetle (Coleoptera: Curculionidae: Scolytinae) Occurrence in Alaska: First Use of Open Access Big Data Mining and Open-Source GIS to Provide Robust Inference and a Role Model for Progress in Forest Conservation. Biodiversity Informatics 1–15. https://journals.ku.edu/jbi/issue/current

Zhang, L., P. Sun, F. Huettmann, and S. Liu. 2022. Where Should China Practice Forestry in a Warming World? Global Change Biology, 00, 1–15. https://doi.org/10.1111/gcb.16065


About the Author

Falk Huettmann is a digital naturalist. With an MSc (Germany), PhD (Canada), and MBA (UAF), he works as a Wildlife Ecologist worldwide, on all continents, with a strong focus on research networks and on polar regions and their connections with the tropics, oceans, and atmosphere. His three-decade-long efforts combining remote field work with open-source geographic information systems (GIS), computing, and machine learning have resulted in over 300 publications, many "massive" open access data sets, and nine books, including a textbook on Machine Learning/AI and Open Access. Falk reviews for over 60 journals and publishers. His research and teaching have received awards from Quantity Matters (QM), National Geographic, Killam Foundation (Canada), and various NGOs, agencies, and governments, e.g., WWF, the EU Parliament, Environment Canada, Department of Fisheries and Oceans (Canada), US Fish & Wildlife Service, and the Global Environmental Fund (GEF).
Email: fhuettmann [at] alaska.edu; Phone: +1-907-474-7882