--- title: "Rules" author: "Prof. Eric A. Suess" date: "February 24, 2021" output: beamer_presentation: default ioslides_presentation: default --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = FALSE) ``` ## Introduction Today we will discuss Classification algorithms using Rules. We will learn about the following Rule bases algorithms. - ZeroR - 1R - RIPPER Note that the C5.0 Algorithm can be used for Rule Learners with the option **rules = TRUE**. ## Data Mining Map To see where we are in the class and to see what is to come, take a look at this flow chart. - [Data Mining Map](http://www.saedsayad.com/data_mining_map.htm) **Question:** What is the difference between and flowchart and a tree diagram? ## Rules From Decision Trees Earlier in chapter there was the example of classifying movies as Box Office Bust, Mainstream Hit, and Critical Success. See page 123/131. **Now Rules.** You can think of a rule as a path through a tree to a decision. See page 149/157 for rules that can be used to classify movies. There are other ways to come up with Rules. ## Decision Trees Decision Trees are built using the approach known as **Divide and Conquer**. Feature values are used to split the data into smaller and smaller subsets of similar cases. ## Rules Classification Rules use the approach called **Separate and Conquer**. According to the author... The process involves identifying a rule that covers a subset of the examples in the training data, and then separates this partition from the remaining data. As rules are added, additional subsets of data are separated until the entire dataset has been covered or no more examples remain. ## Rules Rules based learners usually use **nominal features** ## Greedy learners Both - **Divide-and-conquer** - **Separate-and-conquer** algorithms are known as **greedy learners** because data is used on a first-come, first serve basis. - from Wikipedia [greedy algorithms](https://en.wikipedia.org/wiki/Greedy_algorithm) ## One Rule algorithm **ZeroR** decide to pick the highest probability outcome. **OneR** develop a rule with each feature, use the one rule that has the best performance. ## RIPPER algorithm **RIPPER** Repeated Incremental Pruning to Produce Error Reduction 1. Grow 2. Prune 3. Optimize The *information gain* criterion is used to identify the next splitting attribute. When increasing rule's *specificity* no longer *reduces entropy*, then rule is immediately *pruned*. ## RIPPER algorithm For further details about the RIPPER Algorithm, see Cohen's paper and the following presentation. - [Fast Effective Rule Induction](http://www.cs.utsa.edu/~bylander/cs6243/cohen95ripper.pdf) - [A Ripper presentation](http://www.csee.usf.edu/~hall/dm/ripper.pdf) ## Today Today we will try the identifying poisonous mushrooms example. Check out some of my photos of mushrooms on flickr. [mushroom](https://flic.kr/p/DWZmZU) [mushroom](https://flic.kr/p/Db2sb2) [mushroom](https://flic.kr/p/DFiwP1) [mushroom](https://flic.kr/p/DyVPCa)