Two expert systems to forecast the avalanche hazard
Jurg Schweizer and Paul M.B. Fohn
Operational systems based on the statistical approach using a long-term data base have been developed in several countries and are widely used (Buser et al., 1987; Navarre et al., 1987; Merindol, 1992; McClung and Tweedy, 1994), both for local and for regional avalanche forecasting. The two most popular methods are discriminant analysis and the nearest neighbors (McClung and Schaerer, 1993). Snow and weather data are usually used together with observations of avalanche activity. It is assumed that similar snow and weather conditions should lead to similar avalanche situations. The output is the avalanche activity (i.e. the observed avalanches) of the similar historic situations found in the data base, often in the form of a prediction of "avalanche or non-avalanche day". In the case of regional avalanche forecasting this sort of output is difficult to relate to the actual hazard. Hence it is difficult to assess the real quality of these forecast models. They certainly support the reflections of inexperienced forecasters and may influence experienced forecasters, but may rarely be called a decisive help in determining the degree of hazard for a region.
The aim of the purely deterministic approach is to simulate the avalanche release. On the basis of a snow cover model the avalanche formation is modeled using principles of fracture mechanics (Gubler and Bader, 1989). However, the present deterministic approaches are far from being able to link the single avalanche event to the regional avalanche hazard. Probably most successful is the French approach combining a snow cover model (Brun et al., 1992) with an expert system.
A combined approach, containing deterministic and statistical components, has been developed by Fohn and Haechler (1978). The total loading by snowfall, wind action and settlement is simulated in order to forecast large, dry snow avalanches.
Expert systems represent the idea of simulating the decision-making process of an expert. Most of them are symbolic computing systems, i.e. they use rules formulated explicitly by human experts, e.g. MEPRA (Giraud, 1991) and AVALOG (Bolognesi, 1993).
The French system MEPRA analyzes the snow cover stratigraphy; the snow profiles are simulated by the snow cover model CROCUS (Brun et al., 1989) running with meteorological data provided by SAFRAN (Durand et al., 1993), a model for optimal interpolation of meteorological data.
Recently a hybrid expert system was developed using a neural network and rules extracted from the data base with neural network techniques (Schweizer et al., 1994).
3 A new approach using the CYBERTEK-COGENSYS™ Judgment Processor
In 1991 we worked out a completely new approach, more physically based, comparable to a deterministic system, that tries to model the reasoning of the avalanche forecaster, called MODUL. Both systems, the model DAVOS and the model MODUL, are based on a software package for inductive decision making, the CYBERTEK-COGENSYS™ Judgment Processor; details are given in Schweizer and Fohn (1994).
The CYBERTEK-COGENSYS™ Judgment Processor is commercially available software for inductive automatic decision making. It is based on the fact that pragmatic experts decide using their experience and intuition rather than explicit rules. The more complex a problem, the less structured is the knowledge. An expert is able to decide correctly and fast in a real situation. However, he is usually not able to explain his decision completely by exact rules. The expert's approach is to choose the relevant data (which differ substantially from one situation to another), to classify and analyze the data and finally to draw a conclusion.
The expert building up the system defines the input data needed to reach a particular decision, the output, and the criteria that are used to categorize or evaluate the data; each input parameter has to be grouped in logical ranges (up to five ranges). The expert "teaches" the Judgment Processor by entering examples and interpreting the situations represented by those examples.
The Judgment Processor calculates the logical importance of each input parameter based on the observation of the mentor's decisions. The logical importance is a measure of how a particular input parameter contributes to the logical model as a whole, based on how many situations within the knowledge base would become indistinguishable if that input parameter were removed. Based on the logical importance, given as a number from 1 to 100, the parameters are classified as major or minor. The logical importance is continuously updated, so the system learns incrementally.
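The vendor's exact algorithm is not published, so the following is only one plausible reading of the description above, with all names and the scaling ours: a parameter's importance grows with the number of knowledge-base situations that collide with another situation once that parameter is dropped.

```python
from collections import Counter

def logical_importance(situations, param):
    """Illustrative reconstruction (the actual algorithm is proprietary):
    count the situations that become indistinguishable from at least one
    other situation when `param` is dropped, scaled to 1...100.
    Each situation is a dict mapping parameter name -> logical range index."""
    reduced = [tuple(v for k, v in sorted(s.items()) if k != param)
               for s in situations]
    counts = Counter(reduced)
    collisions = sum(c for c in counts.values() if c > 1)
    return max(1, round(100 * collisions / len(situations)))
```

On a toy knowledge base of four situations with two parameters, dropping the more informative parameter produces more collisions and hence a higher importance score for it.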
If a new situation is encountered the system tries to propose a decision on the basis of the past known situations. The similar situations are found by using the condition of similarity, which prescribes that the majority of the values of the major input parameters have to fall in the same logical range. The quality of the proposed decision is described by the so-called confidence level, an indicator of how certain the system is that its interpretation is appropriate to the current situation: an exclamation mark (!) for very confident, a period (.) for reasonably confident or a question mark (?) for not confident. A low level of confidence suggests that there are few situations that the system considers to be logically similar, or that those situations that are similar have conflicting interpretations. Additionally, the similar situations that are used to derive the decision are also given, together with the corresponding assertion level. If the system is not able to find a decision on the basis of the present knowledge base, it gives the result "not possible to make an interpretation", in the following simply called "n.i." (CYBERTEK-COGENSYS™, 1991).
The Judgment Processor's algorithm is not known in all details. Since the search for similar situations forms the core of the method, it may be called, in the broadest sense, a nearest neighbor method. However, the metric used to search for similar situations differs substantially from the commonly used distance measures, e.g. the Euclidean distance. The categorization of the input data, the classification into major and minor parameters and the metric to search for similar situations are all non-linear. The method is appropriate for data that are neither independent nor normally distributed. Briefly summarized, the system weights and classifies the categorized data, searches for similar situations relying strongly on the classification and categorization, derives a result from the similar situations, describes the quality of the result and finally lists the similar situations used for deriving the result together with the pertinent similarity measure. The advantage of the method is the strong concentration on the input parameters that are considered important.
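A minimal sketch of the similarity condition as stated above - the majority of the major input parameters must fall in the same logical range. The function names are ours, and the actual metric certainly involves more than this condition:

```python
def similar(case, candidate, major_params):
    """True if the majority of the major parameters of `candidate`
    fall in the same logical range as those of `case`.
    Cases are dicts mapping parameter name -> logical range index."""
    matches = sum(case[p] == candidate[p] for p in major_params)
    return matches > len(major_params) / 2

def find_similar(case, knowledge_base, major_params):
    """All situations in the knowledge base satisfying the condition."""
    return [s for s in knowledge_base if similar(case, s, major_params)]
```

Note how this differs from a Euclidean distance: only the major parameters count, and agreement is judged per categorized range, not by numerical closeness.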
In our case the judgment problem is the avalanche hazard and the input parameters are e.g. the 3-day-sum of new snow depth or the air temperature. A real situation is hence described by the set of input parameter values (weather and snow data) for the given day. The logical ranges in the case of the 3-day-sum of new snow depth are e.g. 0...10, 10...30, 30...60, 60...120 and more than 120 cm. Finally, the decision or interpretation is the degree of hazard and, additionally in the DAVOS model (see below), the altitude and the aspect of the most dangerous slopes.
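The grouping of a raw value into logical ranges can be sketched as follows; the range boundaries are those quoted above for the 3-day-sum of new snow depth, while the function name is ours:

```python
import bisect

def logical_range(value_cm, bounds=(10, 30, 60, 120)):
    """Map a 3-day-sum of new snow depth (cm) to one of five logical
    ranges: 0...10, 10...30, 30...60, 60...120, more than 120 cm.
    Returns a range index from 0 to 4."""
    return bisect.bisect_right(bounds, value_cm)
```

For example, 45 cm of new snow falls into range index 2, i.e. the 30...60 cm range.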
The input parameters were chosen from a data set with 21 values which are believed to be representative for the region considered: 7 quantities are measured in the morning in the experimental plots of SFISAR at Weissfluhjoch, 2600 m a.s.l., 4 are prospective values for the day considered and 10 values describe the actual state of the snow cover based on slope measurements performed about every ten days. These principal data are given in Table 1.
The avalanche hazard is formulated first of all as degree of hazard (1...7). Secondly, the lower limit of the primarily endangered altitudes is given in steps of usually 200 m (>1200, >1600, >1800, >2000, >2200, >2400, >2500, >2600, >2800 m a.s.l.). Thirdly, the main aspect is described either as one of the mean directions (N, NE, E, SE, S, SW, W, NW) together with a sector (±45°, ±67°, ±90°) or as extreme slopes or all slopes. If the hazard is given e.g. as 4, >2400 m a.s.l., NE ±90°, this means high hazard on slopes with aspects from north-west to south-east above 2400 m a.s.l.
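For illustration, such an aspect sector can be expanded into a compass interval; the example NE ±90 degrees then yields the interval from NW (315°) to SE (135°) described above. The function and dictionary names are ours:

```python
DIRECTIONS = {'N': 0, 'NE': 45, 'E': 90, 'SE': 135,
              'S': 180, 'SW': 225, 'W': 270, 'NW': 315}

def sector_bounds(mean_dir, half_width_deg):
    """Return the (lower, upper) azimuths in degrees, modulo 360,
    of an aspect sector such as 'NE +/- 90 degrees'."""
    centre = DIRECTIONS[mean_dir]
    return ((centre - half_width_deg) % 360,
            (centre + half_width_deg) % 360)
```

Thus `sector_bounds('NE', 90)` yields `(315, 135)`, i.e. all slopes facing from north-west through north and east to south-east.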
The avalanche hazard, as we use it, is the result of an "a posteriori" critical assessment of the hazard, the so-called verification. The verification has the same structure as the warning. It is hardly possible to verify the avalanche hazard otherwise. Several studies on the verification of the avalanche hazard with the help of the so-called avalanche activity index were not sufficiently successful (Judson and King, 1985; Giraud et al., 1987; Remund, 1993). One reason is that when no avalanches are present or observed, the avalanche hazard is not necessarily nonexistent or very low. Hence it is obviously wrong to use the observed avalanche activity as the sole output parameter in an assisting tool for regional avalanche forecasting.
Operationally the verification has been done some days later, considering the observed avalanche activity (naturally and artificially released), the past weather conditions, additional snow cover tests, the backcountry skiing activity and several other, partly personal observations. Snow cover tests form an important part of the verification work. The verification is an expert task itself and describes the avalanche situation for a given day probably still not exactly, but more accurately than the public avalanche forecast. Whereas the avalanche forecast is correct on about 70% of the days, the verification may be correct on about 90% of the days. By comparison, the weather forecast achieves 80 to 85% correct diagnoses.
Besides the input parameters, we have also chosen the ranges for each of the input parameters according to our experience (Table 2). Based on the 9-year database we are finally able to check whether the chosen ranges were reasonable or not. One example, the 3-day-sum of new snow depth, is given in Figure 2. The situations with a sum of new snow between 30 and 60 cm and between 60 and 120 cm seem to be quite similar. In most situations the degree of hazard is 3 for both ranges. Hence it seems that these ranges do not categorize well. However, it is clear that the sum of new snow depth is only one of the input parameters, which are all interconnected somehow, and that the avalanche hazard cannot be determined by a single input parameter.
The output parameter or result is the avalanche hazard described as degree of hazard, altitude and aspect of the most dangerous slopes.
The knowledge base of the DAVOS model consists of only real situations: the daily data of 9 winters (1 December to 30 April), totaling 1361 situations; 22 situations are pairwise identical.
The original version of the model DAVOS was called DAVOS1. The experience with this version gave rise to the development of further versions (Schweizer and Fohn, 1994). The version DAVOS2 concentrates on the first output result, the degree of hazard, whereas in the original version DAVOS1 all three results are equally important. In the DAVOS2 version the values of the logical importance seem to be closer to the general experience than in the DAVOS1 version, where e.g. the 3-day-sum of new snow depth has no importance at all. The values of the logical importance of the original version DAVOS1 (Table 2) show clearly that this version is hardly able to discriminate. This fact seems definitely to be due to the desired output result that consists of three independent components.
DAVOS31 and DAVOS32 were born from the idea that it is generally important whether for a given day there is new snow or not. Accordingly the knowledge base was split into situations without new snow and ones with new snow.
Finally we tested a version (DAVOS4) that only gives the degree of hazard, and not also the altitude and the aspect of the most dangerous slopes. Due to the single type of output the version DAVOS4 should discriminate better than the other versions and hence should give better results.
First of all it is decisive whether there is new snow or not. Either the forecaster has to assess the new snow stability, or he directly assesses - without new snow - the old snow stability, which is often similar to the stability one day before, except if there is e.g. a large increase of heat transport and/or radiation. So he structures the input data according to the different steps in the decision process. Once both the new snow stability and the old snow stability, each including the effect of the weather as forecast for the day, are decided, the two release probabilities are combined. Taking into account the effect of the terrain and of the skier as a trigger, the degree of hazard is finally determined. At the moment only the degree of hazard is given; the altitude and the aspect as given in the DAVOS model are not yet implemented.
Each of the subproblems, e.g. Quality of new snow or Stability of old snow, represents a judgment problem as described above, and is hence principally structured like the model DAVOS. The different subproblems are just smaller than the DAVOS model, i.e. consist of only 3 to 8 input parameters. Often only 3 of the input parameters are considered as major parameters. This is a large advantage, since a much smaller knowledge base is necessary to get good interpretations and the system usually learns the logic behind the decision process faster and better. It is even possible not only to build up the knowledge base with real situations, but to construct realistic situations by varying the major input parameters in a reasonable sense. This is impossible in the DAVOS model. So if the expert feels sure in one of the subproblems about the influence of one of the input parameters, maybe in combination with another one, he may systematically construct realistic situations and decide them systematically. But this means nothing else than including a rule, not explicitly, but implicitly. An example of such an implicit rule used in the subproblem Final merging is given in Table 3. This is of course rather exhausting work, but the advantage is that the expert is more flexible in his decision than if he used a strict explicit rule. It is easy e.g. to include non-linear relations. Furthermore it is possible to construct extreme, but still realistic situations that usually are rare, but of course very important. So one of the disadvantages of principally statistically based models using real data may be overcome. Finally one ends up with a knowledge base that is a mixture of real, historic situations decided according to the verified hazard at the time, and realistic situations directly decided according to the general knowledge and experience.
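The systematic construction of realistic situations can be sketched as an enumeration over the logical ranges of the major parameters; the expert then interprets each combination, which implicitly encodes a rule. The parameter names and range counts below are hypothetical, chosen only for illustration:

```python
from itertools import product

def constructed_situations(major_ranges):
    """Enumerate every combination of the major parameters' logical
    ranges. `major_ranges` maps parameter name -> number of ranges.
    Each yielded dict is one constructed situation to be judged."""
    names = list(major_ranges)
    for combo in product(*(range(major_ranges[n]) for n in names)):
        yield dict(zip(names, combo))

# Hypothetical subproblem with three major parameters:
majors = {'new_snow_depth': 5, 'snow_drift': 3, 'old_snow_stability': 4}
# 5 * 3 * 4 = 60 constructed situations for the expert to interpret
```

With only 3 major parameters the number of combinations stays small enough to be judged exhaustively, which is exactly why this is feasible for the subproblems but not for the full DAVOS model.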
30 input parameters (Table 4) are used in 11 subproblems interconnected partially by rules. Some of the data are conventional data or mainly so-called low-entropy data (LaChapelle, 1980), some are estimates of the weather development, and more than one third are data on the structure of the snow cover. Hence to get all the data a user with certain skills and experience is required.
The output result of a subproblem is usually used as input parameter in another subproblem that appears later on in the decision process.
Many of the input parameter values are calculated using rules that themselves depend on the input values. The Overall critical depth, e.g., depends on the 3-day-sum of blowing snow depth, which is only considered in certain situations when snow drift is likely.
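Such a conditional rule might look as follows; the exact form of the actual rule is not given in the text, so the function signature and the additive combination are ours, purely for illustration:

```python
def overall_critical_depth(new_snow_3d_cm, blowing_snow_3d_cm, drift_likely):
    """Illustrative sketch of a rule-dependent input value: the
    3-day-sum of blowing snow depth (cm) is only taken into account
    when snow drift is judged likely for the situation."""
    depth = new_snow_3d_cm
    if drift_likely:
        depth += blowing_snow_3d_cm
    return depth
```

The point is that the value fed into the subproblem is not a raw measurement but is itself the outcome of a situation-dependent rule.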
Due to the modular structure it is easily possible to make modifications in any of the subproblems. Additionally, the relatively small number of input parameters in each subproblem enables the knowledge base to adapt rapidly to any modification, e.g. adding a new input parameter.
So the important subproblem Influence of the skier is steadily improved according to the results of the specific study on slab avalanche release triggered by skiers (Schweizer, 1993). In the subproblem Snow profile analysis the snow profile with Rutschblock test, representative for the region considered, is roughly interpreted, a task that actually would need an expert system itself. 8 principal values (Table 1) are used exclusively for solving this subproblem. Together with the subproblem Stability of old snow, it should substitute for the most important input parameter of the DAVOS model, the Index of snow cover stability. So this subproblem is under permanent improvement, too. Recently the Type of release and the Quality of the critical layer were introduced as input parameters.
In operational use, the model has to be run interactively by an experienced user. The model stops if the proposed decision in one of the subproblems does not have a high confidence level, and the user has to confirm the decision before the model continues to run. The final output result, the degree of hazard, is well explained by the output results of the different subproblems. If the model proposes a different degree of hazard than the user has independently estimated, the difference usually becomes obvious by inspecting the output results of the subproblems. Due to this feature the model is not at all a black-box system, but a real supporting tool for the forecaster. The interactive use of the model proved to be very instructive.
To rate the interpretations provided by the system we defined the requirements of quality given in Table 5. Four steps of quality for the given interpretations are defined: good, fair, poor, and wrong. If the verified aspect is e.g. NE ±45°, the ratings in the cases N ±67°, NW ±90° and S ±90° would be about right, not completely wrong, and wrong, respectively.
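One simple way to make such aspect ratings reproducible is to measure the angular overlap between the verified and the proposed sector. This is not the authors' actual rating scheme from Table 5, only an illustration of how aspect agreement could be quantified; the brute-force one-degree sweep and all names are ours:

```python
def overlap_deg(a, b):
    """Angular overlap (in whole degrees) of two aspect sectors,
    each given as a (centre, half_width) pair in degrees."""
    def covers(sector, az):
        centre, half = sector
        diff = (az - centre + 180) % 360 - 180  # signed angle difference
        return abs(diff) <= half
    return sum(1 for az in range(360) if covers(a, az) and covers(b, az))
```

Against a verified sector NE ±45° (centred at 45°), the proposal N ±67° overlaps substantially, while S ±90° barely touches it, matching the qualitative ordering of the ratings above.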
Considering the degree of hazard, the altitude and the aspect, the versions DAVOS1 and DAVOS2 have on average a performance of about 65% and 70% good or fair (see Table 5 for definitions) interpretations, respectively (Schweizer and Fohn, 1994).
To be able to compare the results of the versions DAVOS1 and DAVOS2 to the results of different systems, it is more convenient to only consider the degree of hazard. In that case in 52% and 54% of all situations the degree of hazard was correct compared to the verification for DAVOS1 and DAVOS2 respectively. 86% and 89% of all situations respectively are correct or deviate +/- 1 degree of hazard from the verification.
The versions DAVOS31 and DAVOS32, being complementary to one another, represent a certain improvement; the combined average performance is 61%.
The version DAVOS4 that only predicts the degree of hazard is on the average correct in 63% of all situations. This result represents the best performance of the different versions of model DAVOS. However, considering the performance degree by degree the result is rather disillusioning. The performance for the intermediate degrees 2 and 3 is only 55% and 57%, respectively. These degrees are of course most difficult to forecast. In the case of low or very high hazard the data is more often unambiguous. The extremes are easier to decide. However, since the extreme events at the upper margin of the scale are rare, the correctness is also not too good for these degrees of hazard (59%).
The performance results show quite clearly that probably all statistically based models built on real situations are not able to predict exceptional situations correctly, since this sort of situation is rare.
The experience shows that the more deterministic model MODUL is much more sensitive to single input parameters. A wrong input parameter or a wrong decision in one of the subproblems may have substantial consequences at the end, i.e. a change in the degree of hazard of 1 or 2 steps. So the reaction to a small change may sometimes be drastic. This is especially due to the smaller number of input parameters treated at once in a subproblem, also due to the fact that the output result of a subproblem is often used again as input in another subproblem, and partially due to the fact that the input data are strictly categorized. The latter problem might be removed by introducing fuzzy logic, i.e. defining blurred categories.
Figure 5 compares the correctness with respect to the verified degree of hazard for the different forecasting models DAVOS1, DAVOS2, DAVOS4 and MODUL for the Davos area during the last three winters (1991/92 to 1993/94). It is clear that the more input parameters, or the less complex the result, the better the performance.
The CYBERTEK-COGENSYS™ Judgment Processor - following the idea of inductive decision making - proved to be useful software for developing specific applications in the field of avalanche hazard assessment. Using weather, snow and snow cover data as input parameters, the developed models evaluate the avalanche hazard for a given region. The new features are the choice of elaborated input parameters, especially more snow cover data, the categorization of the input data, the specific algorithm for the search for similar situations, and finally the concise output result. The avalanche hazard is described as degree of hazard, altitude and aspect of the most endangered slopes, for the first time according to the scale used in the forecasts. This sort of output result is most efficient for the purpose of avalanche forecasting; it is much more appropriate to the problem than e.g. the output "avalanche/non-avalanche day". The use of observational avalanche data alone is insufficient for both the forecasting and the verification. The given output result is possible due to the effort of permanently verifying the avalanche hazard. The verification is the most striking feature and makes the data set - at the present time nine winters of weather, snow and snow cover data with the corresponding verified degree of hazard - a probably unique series.
The snow cover data proved to be very important. It is well known that avalanche forecasting depends strongly on the state of the snow cover. However, except for the French model MEPRA, until now none of the present models has taken this obvious fact into account. Of course this sort of data is not easily available, but it is an illusion to expect that a supporting tool without any snow cover data could be as powerful as the expert forecaster. Meteorology plays an important role, but not the decisive one.
The interactive use of the models proved to be a substantial advantage, as especially the model MODUL is very instructive. It is well suited for the training of junior forecasters with a certain basic knowledge.
The model DAVOS - comparable to a statistical model - and the model MODUL - more comparable to a deterministic type of model - achieved a performance of about 60% and 70 to 75%, respectively. There exist no comparable or similar results based on a long-term operational test of any other system for regional avalanche forecasting.