
R IN A NUTSHELL
Second Edition

Joseph Adler

Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo

R in a Nutshell, Second Edition
by Joseph Adler

Copyright © Joseph Adler. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., Gravenstein Highway North, Sebastopol, CA

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (www.oxygen.com.ro). For more information, contact our corporate/institutional sales department: or corporate@www.oxygen.com.ro

Editors: Mike Loukides and Meghan Blanchette
Production Editor: Holly Bauer
Proofreader: Julie Van Keuren
Indexer: Fred Brown
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrators: Robert Romano and Rebecca Demarest

September First Edition.
October Second Edition.

Revision History for the Second Edition:
First release
See www.oxygen.com.ro?isbn= for release details.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. R in a Nutshell, the image of a harpy eagle, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

ISBN: [LSI]

Table of Contents

Preface

Part I. R Basics

1. Getting and Installing R
   R Versions; Getting and Installing Interactive R Binaries; Windows; Mac OS X; Linux and Unix Systems

2. The R User Interface
   The R Graphical User Interface; Windows; Mac OS X; Linux and Unix; The R Console; Command-Line Editing; Batch Mode; Using R Inside Microsoft Excel; RStudio; Other Ways to Run R

3. A Short R Tutorial
   Basic Operations in R; Functions; Variables; Introduction to Data Structures; Objects and Classes; Models and Formulas; Charts and Graphics; Getting Help

4. R Packages
   An Overview of Packages; Listing Packages in Local Libraries; Loading Packages; Loading Packages on Windows and Linux; Loading Packages on Mac OS X; Exploring Package Repositories; Exploring R Package Repositories on the Web; Finding and Installing Packages Inside R; Installing Packages From Other Repositories; Custom Packages; Creating a Package Directory; Building the Package

Part II. The R Language

5. An Overview of the R Language
   Expressions; Objects; Symbols; Functions; Objects Are Copied in Assignment Statements; Everything in R Is an Object; Special Values; NA; Inf and -Inf; NaN; NULL; Coercion; The R Interpreter; Seeing How R Works

6. R Syntax
   Constants; Numeric Vectors; Character Vectors; Symbols; Operators; Order of Operations; Assignments; Expressions; Separating Expressions; Parentheses; Curly Braces; Control Structures; Conditional Statements; Loops; Accessing Data Structures; Data Structure Operators; Indexing by Integer Vector; Indexing by Logical Vector; Indexing by Name; R Code Style Standards

7. R Objects
   Primitive Object Types; Vectors; Lists; Other Objects; Matrices; Arrays; Factors; Data Frames; Formulas; Time Series; Shingles; Dates and Times; Connections; Attributes; Class

8. Symbols and Environments
   Symbols; Working with Environments; The Global Environment; Environments and Functions; Working with the Call Stack; Evaluating Functions in Different Environments; Adding Objects to an Environment; Exceptions; Signaling Errors; Catching Errors

9. Functions
   The Function Keyword; Arguments; Return Values; Functions as Arguments; Anonymous Functions; Properties of Functions; Argument Order and Named Arguments; Side Effects; Changes to Other Environments; Input/Output; Graphics

Object-Oriented Programming
   Overview of Object-Oriented Programming in R; Key Ideas; Implementation Example; Object-Oriented Programming in R: S4 Classes; Defining Classes; New Objects; Accessing Slots; Working with Objects; Creating Coercion Methods; Methods; Managing Methods; Basic Classes; More Help; Old-School OOP in R: S3; S3 Classes; S3 Methods; Using S3 Classes in S4 Classes; Finding Hidden S3 Methods

Part III. Working with Data

Saving, Loading, and Editing Data
   Entering Data Within R; Entering Data Using R Commands; Using the Edit GUI; Saving and Loading R Objects; Saving Objects with save; Importing Data from External Files; Text Files; Other Software; Exporting Data; Importing Data From Databases; Export Then Import; Database Connection Packages; RODBC; DBI; TSDBI; Getting Data from Hadoop

Preparing Data
   Combining Data Sets; Pasting Together Data Structures; Merging Data by Common Fields; Transformations; Reassigning Variables; The Transform Function; Applying a Function to Each Element of an Object; Binning Data; Shingles; Cut; Combining Objects with a Grouping Variable; Subsets; Bracket Notation; subset Function; Random Sampling; Summarizing Functions; tapply, aggregate; Aggregating Tables with rowsum; Counting Values; Reshaping Data; Data Cleaning; Finding and Removing Duplicates; Sorting

Part IV. Data Visualization

Graphics
   An Overview of R Graphics; Scatter Plots; Plotting Time Series; Bar Charts; Pie Charts; Plotting Categorical Data; Three-Dimensional Data; Plotting Distributions; Box Plots; Graphics Devices; Customizing Charts; Common Arguments to Chart Functions; Graphical Parameters; Basic Graphics Functions

Lattice Graphics
   History; An Overview of the Lattice Package; How Lattice Works; A Simple Example; Using Lattice Functions; Custom Panel Functions; High-Level Lattice Plotting Functions; Univariate Trellis Plots; Bivariate Trellis Plots; Trivariate Plots; Other Plots; Customizing Lattice Graphics; Common Arguments to Lattice Functions; www.oxygen.com.roon; Controlling How Axes Are Drawn; Parameters; www.oxygen.com.ros; www.oxygen.com.rot; simpleKey; Low-Level Functions; Low-Level Graphics Functions; Panel Functions

ggplot2
   A Short Introduction; The Grammar of Graphics; A More Complex Example: Medicare Data; Quick Plot; Creating Graphics with ggplot2; Learning More

Part V. Statistics with R

Analyzing Data
   Summary Statistics; Correlation and Covariance; Principal Components Analysis; Factor Analysis; Bootstrap Resampling

Probability Distributions
   Normal Distribution; Common Distribution-Type Arguments; Distribution Function Families

Statistical Tests
   Continuous Data; Normal Distribution-Based Tests; Non-Parametric Tests; Discrete Data; Proportion Tests; Binomial Tests; Tabular Data Tests; Non-Parametric Tabular Data Tests

Power Tests
   Experimental Design Example; t-Test Design; Proportion Test Design; ANOVA Test Design

Regression Models
   Example: A Simple Linear Model; Fitting a Model; Helper Functions for Specifying the Model; Getting Information About a Model; Refining the Model; Details About the lm Function; Assumptions of Least Squares Regression; Robust and Resistant Regression; Subset Selection and Shrinkage Methods; Stepwise Variable Selection; Ridge Regression; Lasso and Least Angle Regression; elasticnet; Principal Components Regression and Partial Least Squares Regression; Nonlinear Models; Generalized Linear Models; glmnet; Nonlinear Least Squares; Survival Models; Smoothing; Splines; Fitting Polynomial Surfaces; Kernel Smoothing; Machine Learning Algorithms for Regression; Regression Tree Models; MARS; Neural Networks; Project Pursuit Regression; Generalized Additive Models; Support Vector Machines

Classification Models
   Linear Classification Models; Logistic Regression; Linear Discriminant Analysis; Log-Linear Models; Machine Learning Algorithms for Classification; k Nearest Neighbors; Classification Tree Models; Neural Networks; SVMs; Random Forests

Machine Learning
   Market Basket Analysis; Clustering; Distance Measures; Clustering Algorithms

Time Series Analysis
   Autocorrelation Functions; Time Series Models

Part VI. Additional Topics

Optimizing R Programs
   Measuring R Program Performance; Timing; Profiling; Monitor How Much Memory You Are Using; Profiling Memory Usage; Optimizing Your R Code; Using Vector Operations; Lookup Performance in R; Use a Database to Query Large Data Sets; Preallocate Memory; Cleaning Up Memory; Functions for Big Data Sets; Other Ways to Speed Up R; The R Byte Code Compiler; High-Performance R Binaries

Bioconductor
   An Example; Loading Raw Expression Data; Loading Data from GEO; Matching Phenotype Data; Analyzing Expression Data; Key Bioconductor Packages; Data Structures; eSet; AssayData; AnnotatedDataFrame; MIAME; Other Classes Used by Bioconductor Packages; Where to Go Next; Resources Outside Bioconductor; Vignettes; Courses; Books

R and Hadoop
   R and Hadoop; Overview of Hadoop; RHadoop; Hadoop Streaming; Learning More; Other Packages for Parallel Computation with R; Segue; doMC; Where to Learn More

Appendix: R Reference
Bibliography
Index

Preface

It’s been over 10 years since I was first introduced to R. Back then, I was a young product development manager at DoubleClick, a company that sold advertising software for managing online ad sales.
I was working on inventory prediction: estimating the number of ad impressions that could be sold for a given search term, web page, or demographic characteristic. I wanted to play with the data myself, but we couldn’t afford a piece of expensive software like SAS or MATLAB. I looked around for a little while, trying to find an open-source statistics package, and stumbled on R. Back then, R was a bit rough around the edges and was missing a lot of the features it has today (like fancy graphics and statistics functions). But R was intuitive and easy to use; I was hooked.

Since that time, I’ve used R to do many different things: estimate credit risk, analyze baseball statistics, and look for Internet security threats. I’ve learned a lot about data and matured a lot as a data analyst.

R, too, has matured a great deal over the past decade. R is used at the world’s largest technology companies (including Google, Microsoft, and Facebook), the largest pharmaceutical companies (including Johnson & Johnson, Merck, and Pfizer), and at hundreds of other companies. It’s used in statistics classes at universities around the world and by statistics researchers to try new techniques and algorithms.

Why I Wrote This Book

This book is designed to be a concise guide to R. It’s not intended to be a book about statistics or an exhaustive guide to R. In this book, I tried to show all the things that R can do and to give examples showing how to do them. This book is designed to be a good desktop reference.

I wrote this book because I like R. R is fun and intuitive in ways that other solutions are not. You can do things in a few lines of R that could take hours of struggling in a spreadsheet. Similarly, you can do things in a few lines of R that could take pages of Java code (and hours of Java coding). There are some excellent books on R, but I couldn’t find an inexpensive book that gave an overview of everything you could do in R. I hope this book helps you use R.

When Should You Use R?

I think R is a great piece of software, but it isn’t the right tool for every problem. Clearly, it would be ridiculous to write a video game in R, but it’s not even the best tool for all data problems.

R is very good at plotting graphics, analyzing data, and fitting statistical models using data that fits in the computer’s memory. It’s not as good at storing data in complicated structures, efficiently querying data, or working with data that doesn’t fit in the computer’s memory.

Typically, I use a scripting language like Perl, Python, or Ruby to preprocess files before using them in R. (If the files are really big, I’ll use Pig.) It’s technically possible to use R for these problems (by reading files one line at a time and using R’s regular expression support), but it’s pretty awkward. To hold large data files, I usually use Hadoop. Sometimes I use a database like MySQL, PostgreSQL, SQLite, or Oracle (when someone else is paying the license fee).

What’s New in the Second Edition?

This edition isn’t a total rewrite of the first book. But I have tried to improve the book in a few significant ways:

• There are new chapters on ggplot2 and using R with Hadoop.
• Formatting changes should make code examples easier to read.
• I’ve changed the order of the book slightly, grouping the plotting chapters together.
• I’ve made some minor updates to reflect changes in recent releases of R.
• There are some new sections on useful tools for manipulating data in R, such as plyr and reshape.
• I’ve corrected dozens of errors.

R License Terms

R is an open-source software package, licensed under the GNU General Public License (GPL).[1] This means that you can install R for free on most desktop and server machines. (Comparable commercial software packages sell for hundreds or thousands of dollars. If R were a poor substitute for the commercial software packages, they might have limited appeal.
However, I think R is better than its commercial counterparts in many respects.)

Capability
    You can find implementations for hundreds (maybe thousands) of statistical and data analysis algorithms in R. No commercial package offers anywhere near the scope of functionality available through the Comprehensive R Archive Network (CRAN).

Community
    There are now hundreds of thousands (if not millions) of R users worldwide. By using R, you can be sure that you’re using the same software your colleagues are using.

Performance
    R’s performance is comparable, or superior, to most commercial analysis packages. R requires you to load data sets into memory before processing. If you have enough memory to hold the data, R can run very quickly. Luckily, memory is cheap. You can buy 32 GB of server RAM for less than the cost of a single desktop license of a comparable piece of commercial statistical software.

Examples

In this book, I have tried to provide many working examples of R code. I deliberately decided to use new and original examples, instead of relying on the data sets included with R. I am not implying that the included examples are not good; they are good. I just wanted to give readers a second set of examples. In most cases, the examples are short and simple and I have not provided them in a downloadable form. However, I have included example data and a few of the longer examples in the nutshell R package, available through CRAN. To install the nutshell package, type the following command on the R console:

    > install.packages("nutshell")

1. There is some controversy about GPL-licensed software and what it means to you as a corporate user. Some users are afraid that any code they write in R will be bound by the GPL. If you are not writing extensions to R, you do not need to worry about this issue. R is an interpreter, and the GPL does not apply to a program just because it is executed on a GPL-licensed interpreter.
If you are writing extensions to R, they might be bound by the GPL. For more information, see the GNU foundation’s FAQ on the GPL: www.oxygen.com.ro. For a definite answer, or if you are worried about a specific application, see an attorney.

How This Book Is Organized

I’ve broken this book into parts:

• Part I, R Basics, covers the basics of getting and running R. It’s designed to help get you up and running if you’re a new user, including a short tour of the many things you can do with R.
• Part II, The R Language, picks up where the first section leaves off, describing the R language in detail.
• Part III, Working with Data, covers data processing in R: loading data into R, transforming data, and summarizing data.
• Part IV, Data Visualization, describes how to plot data with R.
• Part V, Statistics with R, covers statistical tests and models in R.
• Part VI, Additional Topics, contains chapters that don’t belong elsewhere: tuning R programs, writing parallel R programs, and Bioconductor.
• Finally, I included an Appendix describing functions and data sets included with the base distribution of R.

If you are new to R, install R and start with Chapter 3. Next, take a look at Chapter 5 to learn some of the rules of the R language. If you plan to use R for plotting, statistical tests, or statistical models, take a look at the appropriate chapter. Make sure you look at the first few sections of the chapter, because these provide an overview of how all the related functions work. (For example, don’t skip straight to “Random forests for regression” without reading “Example: A Simple Linear Model.”)

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords. (When showing input and output on the R console, I use constant width text to show prompts and other information produced by the R interpreter.)

Constant width bold
    Shows commands or other text that should be typed literally by the user. (When showing input and output on the R console, I use constant width bold text to show you what I typed, including comments.)

Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

This icon indicates a tip, suggestion, or general note.

This icon indicates a warning or a caution.

In this book, I will sometimes show commands that I entered on my operating system prompt (i.e., in a Bash shell on Linux), and sometimes show commands that I entered in the R console. For commands that I entered in the operating system shell, I use a $ character to show the prompt; for commands entered in the R console, I will use > or + to show the prompt. (In either case, don’t type the prompt character.)

Using Code Examples

This book is here to help you get your job done. In general, you may use the code in this book in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution.
An attribution usually includes the title, author, publisher, and ISBN. For example: “R in a Nutshell by Joseph Adler. Copyright Joseph Adler, ”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@www.oxygen.com.ro

Safari® Books Online

Safari Books Online (www.oxygen.com.ro) is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.

Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.

Safari Books Online offers a range of product mixes and pricing programs for organizations, government agencies, and individuals. Subscribers have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and dozens more. For more information about Safari Books Online, please visit us online.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

    O’Reilly Media, Inc.
    Gravenstein Highway North
    Sebastopol, CA
    (in the United States or Canada)
    (international or local)
    (fax)

We have a web page for this book where we list errata, examples, and any additional information.
You can access this page at www.oxygen.com.ro

To comment or to ask technical questions about this book, send email to bookquestions@www.oxygen.com.ro

For more information about our books, courses, conferences, and news, see our website at www.oxygen.com.ro

Find us on Facebook: www.oxygen.com.ro
Follow us on Twitter: www.oxygen.com.ro
Watch us on YouTube: www.oxygen.com.ro

Acknowledgments

First, I’d like to thank everyone who read the first book. I wrote R in a Nutshell to be useful. I tried to write the book that I wanted to read; I tried my best to share as much useful information as I could about R. That’s an ambitious goal, and I wrote an imperfect book. I appreciate all the feedback, suggestions, and corrections that I have received from readers and have tried my best to improve the book in the second edition.

I’d like to thank the team at O’Reilly for their support. Tim O’Reilly has said that he follows three guiding principles: work on something that matters to you more than money, create more value than you capture, and take the long view.[2] I tried to follow these principles when writing this book. As an author, I felt like the team at O’Reilly followed these principles. My goal in writing R in a Nutshell was to write the best book I could write. I hope that when people read this book, they learn something new and use what they learned to solve important problems.

2. See www.oxygen.com.ro

Many people helped support the writing of this book. First, I’d like to thank all of my technical reviewers. These folks check to make sure the examples work, look for technical and mathematical errors, and make many suggestions on writing quality. It’s not possible to write a quality technical book without quality technical reviewers: Peter Goldstein, Aaron Mandel, and David Hoaglin are the reason that this book reads as well as it does.

For the past two years, I’ve worked at LinkedIn, ground zero for the data revolution.
I’ve learned a huge amount working side by side with people like DJ Patil, Monica Rogati, Daniel Tunkelang, Sam Shah, and Jay Kreps. I’ve had the chance to discover interesting patterns, figure out how to share them with other people, and figure out how to scale my programs to work for hundreds of millions of users. I hope the second edition of this book reflects some of the lessons that I’ve learned on data, and helps other people learn the same things.

I’d like to thank Randall Munroe, author of the xkcd comic. He kindly allowed us to reprint two of his (excellent) comics in this book. You can find his comics (and assorted merchandise) at www.oxygen.com.ro

Additionally, I’d like to thank everyone who provided or suggested improvements. Aaron Schatz of Football Outsiders provided me with play-by-play data from an NFL season (the field goal data is from its database). Sandor Szalma of Johnson & Johnson suggested GSE as an example of gene expression data. Jeremy Howard of Kaggle suggested adding glmnet.

Finally, I’d like to thank my wife, Sarah, my daughter, Zoe, and my son, Zeke. Writing a book takes a lot of time, and they were very understanding when I needed to work. They were also very understanding when I dragged them to the San Diego Zoo to look at the harpy eagles.

Part I. R Basics

This part of the book covers the basics of R: how to get R, how to install it, and how to use packages in R. It also includes a quick tutorial on R and an overview of the features of R.

1. Getting and Installing R

This chapter explains how to get R and how to install it on your computer.

R Versions

Today, R is maintained by a team of developers around the world. Usually, there is an official release of R twice a year, in April and in October. I’ve checked the code in this book against one specific release, but if you have an earlier or later version of R installed, don’t worry.
R hasn’t changed that much in the past few years: usually there are some bug fixes, some optimizations, and a few new functions in each release. There have been some changes to the language, but most of these are related to somewhat obscure features that won’t affect most users. (For example, the type of NA values in incompletely initialized arrays was changed in one release.) Don’t worry about using the exact version of R that I used in this book; any results you get should be very similar to the results shown in this book. If there are any changes to R that affect the examples in this book, I’ll try to add them to the official errata online.

Additionally, I’ve given some example filenames below for the current release. The filenames usually have the release number in them. So don’t worry if you’re reading this book and don’t see a link for the exact Windows installer filename I mention but see one with a different release number instead; just use the latest version and you should be fine.

Getting and Installing Interactive R Binaries

R has been ported to every major desktop computing platform. Because R is open source, developers have ported R to many different platforms. Additionally, R is available with no license fee.

If you’re using a Mac or a Windows machine, you’ll probably want to download the files yourself and then run the installers. (If you’re using Linux, I recommend using a port management system like Yum to simplify the installation and updating process; see “Linux and Unix Systems.”) Here’s how to find the binaries.

1. Visit the official R website. On the site, you should see a link to “Download.”
2. The download link actually takes you to a list of mirror sites. The list is organized by country. You’ll probably want to pick a site that is geographically close, because it’s likely to also be close on the Internet, and thus fast. I usually use the link for the University of California, Los Angeles, because I live in California.
3. Find the right binary for your platform and run the installer.
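
Because release numbers matter less than simply having a reasonably recent build, one quick way to see what you ended up with is to ask R itself. The following is a minimal sketch using getRversion() and R.version.string from base R. The minimum version string is a hypothetical value chosen only for illustration; it is not a requirement stated in this book.

```r
# Ask the running interpreter which release it is.
installed <- getRversion()               # a version object that supports comparison
cat("Running:", R.version.string, "\n")  # human-readable release string

# Hypothetical threshold, for illustration only (an assumption, not a requirement).
minimum <- "2.10.0"

# Version objects can be compared directly against a version string.
if (installed >= minimum) {
  message("This installation is at least R ", minimum, ".")
} else {
  message("This installation is older than R ", minimum, "; consider upgrading.")
}
```

Comparing against a version object, rather than comparing raw strings, avoids mistakes such as treating "2.9" as newer than "2.10".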

There are a few things to keep in mind, depending on what system you’re using.

Building R from Source

It’s standard practice to build R from source on Linux and Unix systems, but not on Mac OS X or Windows platforms. It’s pretty tricky to build your own binaries on Mac OS X or Windows, and it doesn’t yield a lot of benefits for most users. Building R from source won’t save you space (you’ll probably have to download a lot of other stuff, like LaTeX), and it won’t save you time (unless you already have all the tools you need and have a really, really slow Internet connection). The best reason to build your own binaries is to get better performance out of R, but I’ve never found R’s performance to be a problem, even on very large data sets. If you’re interested in how to build your own R, see “Building your own.”

Windows

Installing R on Windows is just like installing any other piece of software on Windows, which means that it’s easy if you have the right permissions, difficult if you don’t. If you’re installing R on your personal computer, this shouldn’t be a problem. However, if you’re working in a corporate environment, you might run into some trouble.

If you’re an “Administrator” or “Power User” on Windows XP, installation is straightforward: double-click the installer and follow the on-screen instructions.

There are some known issues with installing R on Microsoft Windows Vista. In particular, some users have problems with file permissions. Here are two approaches for avoiding these issues:

• Install R as a standard user in your own file space. This is the simplest approach.
• Install R as the default Administrator account (if it is enabled and you have access to it). Note that you will also need to install packages as the Administrator user.

For a full explanation, see www.oxygen.com.ro #Does-R-run-under-Windows-Vista_f.

Currently, CRAN releases only bit builds of R for Microsoft Windows.
These are tested on bit versions of Windows and should run correctly.

Mac OS X

The current version of R runs on both PowerPC- and Intel-based Mac systems running Mac OS X 10.5 (Leopard) and higher. If you’re using an older operating system, or an older computer, you can find older versions on the website that may work better with your system.

You’ll find three different R installers for Mac OS X: a three-way universal binary for Mac OS X 10.5 (Leopard) and higher, a legacy universal binary for an earlier Mac OS X release and higher with supplemental tools, and a legacy universal binary for an earlier Mac OS X release and higher without supplemental tools. See the CRAN download site for more details on the differences among these versions.

As with most applications, you’ll need to have the appropriate permissions on your computer to install R. If you’re using your personal computer, you’re probably OK: you just need to remember your password. If you’re using a computer managed by someone else, you may need that person’s help to install R.

The universal binary of R is made available as an installer package; simply download the file and double-click the package to install the application. The legacy R installers are packaged on a disk image file (like most Mac OS X applications). After you download the disk image, double-click it to open it in the Finder (if it does not automatically open). Open the volume and double-click the www.oxygen.com.ro icon to launch the installer. Follow the directions in the installer, and you should have a working copy of R on your computer.

Linux and Unix Systems

Before you start, make sure that you know the system’s root password or have sudo privileges on the system you’re using. If you don’t, you’ll need to get help from the system administrator to install R.

Installation using package management systems

On a Linux system, the easiest way to install R is to use a package management system.
These systems automate the installation process: they fetch the R binaries (or sources), get any other software that’s needed to run R, and even make upgrading to the latest version easy.

On Red Hat (or Fedora), you can use Yum (which stands for “Yellowdog Updater, Modified”) to automate the installation. For example, on a 64-bit x86 Linux platform, open a terminal window and type:

    $ sudo yum install R.x86_64

You’ll be prompted for your password, and if you have sudo privileges, R should be installed on your system. Later, you can update R by typing:

    $ sudo yum update R.x86_64

If there is a new version available, your R installation will be upgraded to the latest version.

If you’re using another Unix system, you may also be able to install R. (For example, R is available through the FreeBSD Ports system at www.oxygen.com.ro .cgi/ports/math/R/.) I haven’t tried these versions but have no reason to think they don’t work correctly. See the documentation for your system for more information about how to install software.

Installing R from downloaded files

If you’d like, you can manually download R and install it later. Currently, there are precompiled R packages for several flavors of Linux, including Red Hat, Debian, Ubuntu, and SUSE. Precompiled binaries are also available for Solaris.

On Red Hat–style systems, you can install these packages through the Red Hat Package Manager (RPM). For example, suppose that you downloaded the file Rfcirpm to the directory ~/Downloads. Then you could install it with a command like:

    $ rpm -i ~/Downloads/Rfcirpm

For more information on using RPM, or other package management systems, see your user documentation.

2. The R User Interface

If you’re reading this book, you probably have a problem that you would like to solve in R.
You might want to:

• Check the statistical significance of experimental results
• Plot some data to help understand it better
• Analyze some genome data

The R system is a software environment for statistical computing and graphics. It includes many different components. In this book, I’ll use the term “R” to refer to a few different things:

• A computer language
• The interpreter that executes code written in R
• A system for plotting computer graphics described using the R language
• The Windows, Mac OS, or Linux application that includes the interpreter, graphics system, standard packages, and user interface

This chapter contains a short description of the R user interface and the R console and describes how R varies on different platforms. If you’ve never used an interactive language, this chapter will explain some basic things you will need to know in order to work with R. We’ll take a quick look at the R graphical user interface (GUI) on each platform and then talk about the most important part: the R console.

The R Graphical User Interface

Let’s get started by launching R and taking a look at R’s graphical user interface on different platforms. When you open the R application on Windows or Mac OS X, you’ll see a command window and some menu bars. On most Linux systems, R will simply start on the command line.

Windows

By default, R is installed into %ProgramFiles%\R (which is usually C:\Program Files\R) and added to the Start menu under the group R. When you launch R in Windows, you’ll see something like the user interface shown in the accompanying figure. Inside the R GUI window, there is a menu bar, a toolbar, and the R console.

Figure: R user interface on Windows XP

Mac OS X

The default R installer will add an application called R to your Applications folder that you can run like any other application on your Mac.
When you launch the R application on Mac OS X systems, you’ll see something like the screen shown in the accompanying figure. Like the Windows system, there is a menu bar, a toolbar with common functions, and an R console window.

On a Mac OS system, you can also run R from the terminal without using the GUI. To do this, first open a terminal window. (The terminal program is located in the Utilities folder inside the Applications folder.) Then enter the command “R” on the command line to start R.

1. Yes, these are old screen shots. R has not changed very much, so we kept these the same in the second edition.

Figure: R user interface on Mac OS X

Linux and Unix

On Linux systems, you can start R from the command line by typing:

    $ R

Notice that it’s a capital “R”; filenames on Linux are case sensitive. (And don’t type the “$” character; that’s just the Unix prompt.)

Unlike the default applications for Mac OS and Windows, this will start an interactive R session on the command line itself. If you prefer, you can launch R in an application window similar to the user interface on other platforms. To do this, use the following command:

    $ R -g Tk &

This will launch R in the background running in its own window, as shown in the accompanying figure. Like the other platforms, there is a menu bar with some common functions, but unlike the other platforms, there is no toolbar. The main window acts as the R console.

Figure: The interface for R on Fedora

Additional R GUIs

If you’re a typical desktop computer user, you might find it surprising to discover how little functionality is implemented in the standard R GUI. The standard R GUI implements only very rudimentary functionality through menus: reading help, managing multiple graphics windows, editing some source and data files, and some other basic functionality.
There are no menu items, buttons, or palettes for loading data, transforming data, plotting data, building models, or doing any interesting work with data. Commercial applications like SAS, SPSS, and S-PLUS include UIs with much more functionality. Several projects are aiming to build an easier-to-use GUI for R:

Rcmdr
    The Rcmdr project is an R package that provides an alternative GUI for R. You can install it as an R package. It provides some buttons for loading data and menu items for many common R functions.

Rkward
    Rkward is a slick GUI front end for R. It provides a palette and menu-driven UI for analysis, data-editing tools, and an IDE for R code development. It’s still a young project and currently works best on Linux platforms (though Windows builds are available). It is available from www.oxygen.com.ro mediawiki/rkward/.

R Productivity Environment
    Revolution Computing recently introduced a new IDE called the R Productivity Environment. This IDE provides many features for analyzing data: a script editor, object browser, visual debugger, and more. The R Productivity Environment is currently available only for Windows, as part of Revolution R Enterprise.

RStudio
    RStudio is a popular, open source IDE for working with R. To learn more, see the “RStudio” section later in this chapter.

You can find a list of additional projects at www.oxygen.com.ro This book does not cover any of these projects in detail. However, you should still be able to use this book as a reference for all of these packages because they all use (and expose) R functions.

The R Console

The R console is the most important tool for using R. It lets you type commands into R and see how the R system responds. The commands that you type into the console are called expressions. A part of the R system called the interpreter will read the expressions and respond with a result or an error message.
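The two outcomes described above, a result or an error message, can be sketched in a few lines of R. This is only an illustration: the variable names are placeholders, and tryCatch is used here simply so that the error can be captured and shown instead of stopping the script.

```r
# A valid expression: the interpreter evaluates it and returns a value.
result <- 17 + 3
print(result)

# An invalid expression: adding a number to a character string is an error.
# tryCatch captures the error so we can look at its message.
outcome <- tryCatch(17 + "3", error = function(e) conditionMessage(e))
print(outcome)
```

Typed directly at the console, the second expression would simply print an error message and return you to the prompt.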
Sometimes, you can also enter an expression into R through the menus.

If you’ve used a command line before (for example, the cmd.exe program on Windows) or a language with an interactive interpreter such as LISP, this should look familiar.2 If not, don’t worry. Command-line interfaces aren’t as scary as they look. R provides a few tools to save you extra typing, to help you find the tools you’re looking for, and to spot common mistakes. Besides, you have a whole reference book on R that will help you figure out how to do what you want.

2. Incidentally, R has quite a bit in common with LISP: both languages allow you to compute expressions on the language itself, both languages use similar internal structures to hold data, and both languages use lots of parentheses.

Personally, I think a command-line interface is the best way to analyze data. After I finish working on a problem, I want a record of every step that I took. (I want to know how I loaded the data, if I took a random sample, how I took the sample, whether I created any new variables, what parameters I used in my models, etc.) A command-line interface makes it very easy to keep a record of everything I do and then re-create it later if I need to.

When you launch R, you will see a window with the R console. Inside the console, you will see a message like this:

    R version () -- "Roasted Marshmallows"
    Copyright (C) The R Foundation for Statistical Computing
    ISBN
    Platform: x86_apple-darwin/x86_64 (bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.
    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    [R.app GUI () x86_apple-darwin]

    [History restored from /Users/jadler/.Rapp.history]

This window displays some basic information about R: the version of R you’re running, some license information, quick reminders about how to get help, and a command prompt.

By default, R will display a greater-than sign (“>”) in the console (at the beginning of a line, when nothing else is shown) when R is waiting for you to enter a command into the console. R is prompting you to type something, so this is called a prompt.

For example, suppose that you typed 17 + 3 on the console. You would see something similar to this:

    > 17 + 3
    [1] 20

This means:

• I entered “17 + 3” into the R command prompt.
• The computer responded by writing “[1] 20” (I’ll explain what that means in Chapter 3).

If you would like to try this yourself, then type “17 + 3” at the command prompt and press the Enter key. You should see a response like the one shown above.

In this book, I will show text that I have typed in boldface. So, when you see an entry like this in the book:

    > 17 + 3
    [1] 20

that means that I typed “17 + 3” into the console but that all the other text was generated by R. (Your terminal probably won’t display text you have entered in bold.)

Sometimes, an R command doesn’t fit on a single line. If you enter an incomplete command on one line, the R prompt will change to a plus sign (“+”). Here’s a simple example:

    > 1 * 2 * 3 * 4 * 5 *
    + 6 * 7 * 8 * 9 * 10
    [1] 3628800

This could cause confusion in some cases (such as in long expressions that contain sums or inequalities). On most platforms, command prompts, user-entered text, and R responses are displayed in different colors to help clarify the differences. The following table presents a summary of the default colors.
Table: Text colors in R interactive mode

    Platform            Command prompt   User input   R output
    Mac OS X            Purple           Blue         Black
    Microsoft Windows   Red              Red          Blue
    Linux               Black            Black        Black

Command-Line Editing

On most platforms, R provides tools for looking through previous commands.3 You will probably find that the most important line edit commands are the up and down arrow keys. By placing the cursor at the end of the line, you can scroll through commands by pressing the up arrow or the down arrow. The up arrow lets you look at earlier commands, and the down arrow lets you look at later commands. If you would like to repeat a previous command with a minor change (such as a different parameter), or if you need to correct a mistake (such as a missing parenthesis), you can do this easily.

You can also type history() to get a list of previously typed commands.4

R also includes automatic completions for function names and filenames. Press the Tab key to see a list of possible completions for a function or a filename.

3. On Linux and Mac OS X systems, the command line uses the GNU readline library and includes a large set of editing commands. On Windows platforms, a smaller number of editing commands is available.

4. As of this writing, the history command does not work completely correctly on Mac OS X. The history command will display the last saved history, not the history for the current session.

Batch Mode

R’s interactive mode is convenient for most ad hoc analyses, but typing in every command can be inconvenient for some tasks. Suppose that you wanted to do the same thing with R multiple times. (For example, you may want to load data from an experiment, transform it, generate three plots as Portable Document Format [PDF] files, and then quit.) R provides a way to run a large set of commands in sequence and save the results to a file. This is called batch mode.

One way to run R in batch mode is from the system command line (not the R console). By running R from the system command line, it’s possible to run a set of commands without opening an interactive R session. This makes it easier to automate analyses, as you can change a couple of variables and rerun an analysis. For example, to load a set of commands from the file generate_graphs.R, you would use a command like this:

    $ R CMD BATCH generate_graphs.R

R would run the commands in the input file generate_graphs.R, generating an output file called generate_graphs.Rout with the results. You can also specify the name of the output file. For example, to put the output in a file labeled with today’s date (on a Mac or Unix system), you could use a command like this:

    $ R CMD BATCH generate_graphs.R generate_graphs_`date "+%y%m%d"`.log

If you’re generating graphics in batch mode, remember to specify the output device and filenames. For more information about running R from the command line, including a list of the available options, run R from the command line with the --help option:

    $ R --help

One key disadvantage of running R using the command R CMD BATCH is that your scripts cannot access the system’s standard input. Luckily, there is a second command for running R in batch mode: the Rscript command. (Note the capitalization; command names are case sensitive on Linux.) You can execute a script with a command like this:

    $ Rscript generate_graphs.R

Additionally, you can write executable scripts using Rscript. Here’s an example of how to do this (on Linux, Mac OS, or other Unix-like systems). First, create a file called hello_world.R with the following contents:

    #! /usr/bin/env Rscript
    print("Hello world!");

Next, type the following command to make the script executable:

    $ chmod +x hello_world.R

Now you can execute this script like any other command:

    $ ./hello_world.R
    [1] "Hello world!"

We will use this ability in the “Hadoop Streaming” section later in this book.

Finally, you can also run commands in batch mode from inside R.
To do this, you can use the source command; see the help file for source for more information.

Using R Inside Microsoft Excel

If you’re familiar with Microsoft Excel, or if you work with a lot of data files in Excel format, you might want to run R directly from inside Excel. The RExcel software lets you do just that (on Microsoft Windows systems). You can find information about this software at www.oxygen.com.ro This site also includes a single installer that will install R plus all the other software you need to use RExcel.

If you already have R installed, you can install RExcel as a package from CRAN. The following set of commands will download RExcel, configure the RCOM server, install RDCOM, and launch the RExcel installer:

    > install.packages(c("RExcelInstaller", "rcom", "rsproxy"))
    > # configure rcom
    > library(rcom)
    > comRegisterRegistry()
    > library(RExcelInstaller)
    > # execute the following command in R to start the installer for RDCOM
    > installstatconnDCOM()
    > # execute the following command in R to start the installer for REXCEL
    > installRExcel()

Follow the prompts within the installer to install RExcel.

After you have installed RExcel, you will be able to access RExcel from a menu item. If you are using a ribbon-based version of Excel, you will need to select the “Add-Ins” ribbon to find this menu, as shown in the accompanying figure.

To use RExcel, first select the R Start menu item. As a simple test, try doing the following:

1. Enter a set of numeric values into a column in Excel (for example, B1:B5).
2. Select the values you entered.
3. On the RExcel menu, go to the item Put R Var → Array.
4. A dialog box will open, asking you to name the object you are creating in Excel. Enter v and press the Enter key. This will create an array (in this case, just a vector) in R with the values that you entered with the name v.
5. Now, select a blank cell in Excel.
6. On the RExcel menu, go to the item Get R Value → Array.
7. A dialog box will open, prompting you to enter an R expression. As an example, try entering (v - mean(v)) / sd(v). This will rescale the contents of v, changing the mean to 0 and the standard deviation to 1.
8. Inspect the results that have been returned within Excel.

For some more interesting examples of how to use RExcel, take a look at the Demo Worksheets under this menu. You can use Excel functions to evaluate R expressions, use R expressions in macros, and even plot R graphics within Excel.

Figure: Accessing RExcel in Microsoft Excel

RStudio

RStudio has become one of the most popular ways to run R. RStudio is a free, open-source integrated development environment (IDE) for R. A screen shot of RStudio is shown in the accompanying figure. Unlike the standard R GUI, RStudio tiles windows on the screen and puts different windows in different tabs. Additionally, you can install RStudio on a Linux server and access R from a web browser. To learn more about RStudio and download a copy, see www.oxygen.com.ro

Figure: RStudio

Other Ways to Run R

There are several open-source projects that allow you to combine R with other applications:

As a web application
    The rApache software allows you to incorporate analyses from R into a web application. (For example, you might want to build a server that shows sophisticated reports using R lattice graphics.) For information about this project, see www.oxygen.com.ro

As a server
    The Rserve software allows you to access R from within other applications. For example, you can produce a Java program that uses R to perform some calculations. As the name implies, Rserve is implemented as a network server, so a single Rserve instance can handle calculations from multiple users on different machines.
    One way to use Rserve is to install it on a heavy-duty server with lots of CPU power and memory, so that users can perform calculations that they couldn’t easily perform on their own desktops. For more about this project, see www.oxygen.com.ro As we described above, you can also use RStudio to run R on a server and access it from a web browser.

Inside Emacs
    The ESS (Emacs Speaks Statistics) package is an add-on for Emacs that allows you to run R directly within Emacs. For more on this project, see http://ess.r www.oxygen.com.ro

3. A Short R Tutorial

This chapter contains a short tutorial of R with a lot of examples. If you’ve never used R before, this is a great time to start it up and try playing with it. There’s no better way to learn something than by trying it yourself. You can follow along by typing in the same text that’s shown in the book. Or, try changing it a little bit to see what happens. (For example, if the sample code says “3 + 4,” try typing 3 - 4 instead.)

If you’ve never used an interactive language before, take a look at Chapter 2 before you start. That chapter contains an overview of the R environment, including the console. Otherwise, you might find the presentation of the examples (and the terminology) confusing.

Basic Operations in R

Let’s get started using R. When you enter an expression into the R console and press the Enter key, R will evaluate that expression and display the results (if there are any). If the statement results in a value, R will print that value. For example, you can use R to do simple math:

    > 1 + 2 + 3
    [1] 6
    > 1 + 2 * 3
    [1] 7
    > (1 + 2) * 3
    [1] 9

The interactive R interpreter will automatically print an object returned by an expression entered into the R console. Notice the funny “[1]” that accompanies each returned value. In R, any number that you enter in the console is interpreted as a vector. A vector is an ordered collection of numbers.
The “[1]” means that the index of the first item displayed in the row is 1. In each of these cases, there is also only one element in the vector.

You can construct longer vectors using the c() function. (c stands for “combine.”) For example:

    > c(0, 1, 1, 2, 3, 5, 8)
    [1] 0 1 1 2 3 5 8

is a vector that contains the first seven elements of the Fibonacci sequence. As an example of a vector that spans multiple lines, let’s use the sequence operator to produce a vector with every integer between 1 and 50:

    > 1:50
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
    [23] 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
    [45] 45 46 47 48 49 50

Notice the numbers in the brackets on the left-hand side of the results. These indicate the index of the first element shown in each row.

When you perform an operation on two vectors, R will match the elements of the two vectors pairwise and return a vector. For example:

    > c(1, 2, 3, 4) + c(10, 20, 30, 40)
    [1] 11 22 33 44
    > c(1, 2, 3, 4) * c(10, 20, 30, 40)
    [1]  10  40  90 160
    > c(1, 2, 3, 4) - c(1, 1, 1, 1)
    [1] 0 1 2 3

If the two vectors aren’t the same size, R will repeat the smaller sequence multiple times:

    > c(1, 2, 3, 4) + 1
    [1] 2 3 4 5
    > 1 / c(1, 2, 3, 4, 5)
    [1] 1.0000000 0.5000000 0.3333333 0.2500000 0.2000000
    > c(1, 2, 3, 4) + c(10, 100)
    [1]  11 102  13 104
    > c(1, 2, 3, 4, 5) + c(10, 100)
    [1]  11 102  13 104  15
    Warning message:
    In c(1, 2, 3, 4, 5) + c(10, 100) :
      longer object length is not a multiple of shorter object length

Note the warning when the length of the longer vector isn’t a multiple of the length of the shorter one.

In R, you can also enter expressions with characters:

    > "Hello world."
    [1] "Hello world."

This is called a character vector in R. This example is actually a character vector of length 1. Here is an example of a character vector of length 2:

    > c("Hello world", "Hello R interpreter")
    [1] "Hello world"         "Hello R interpreter"

(In other languages, like C, “character” refers to a single character, and an ordered set of characters is called a string.
A string in C is equivalent to a character value in R.)

You can add comments to R code. Anything after a pound sign (“#”) on a line is ignored:

    > # Here is an example of a comment at the beginning of a line
    > 1 + 2 + # and here is an example in the middle
    + 3
    [1] 6

Functions

In R, the operations that do all the work are called functions. We’ve already used a few functions above (you can’t do anything interesting in R without them). Functions are just like what you remember from math class. Most functions are in the following form:

    f(argument1, argument2, ...)

where f is the name of the function, and argument1, argument2, ... are the arguments to the function. Here are a few more examples:

    > exp(1)
    [1] 2.718282
    > cos(3.141593)
    [1] -1
    > log2(1)
    [1] 0

In each of these examples, the functions took only one argument. Many functions require more than one argument. You can specify the arguments by name:

    > log(x=64, base=4)
    [1] 3

Or, if you give the arguments in the default order, you can omit the names:

    > log(64,4)
    [1] 3

Not all functions are of the form f(...). Some of them are in the form of operators.1 For example, we used the addition operator (“+”) above. Here are a few examples of operators:

    > 17 + 2
    [1] 19
    > 2 ^ 10
    [1] 1024
    > 3 == 4
    [1] FALSE

1. When you enter a binary or unary operator into R, the R interpreter will actually translate the operator into a function; there is a function equivalent for each operator. We’ll talk about this more in Chapter 5.

We’ve seen the first one already: it’s just addition. The second operator is the exponentiation operator, which is interesting because it’s not a commutative operator. The third operator is the equality operator. (Notice that the result returned is FALSE; R has a Boolean data type.)

Variables

Like most other languages, R lets you assign values to variables and refer to them by name. In R, the assignment operator is <-.
Usually, this is pronounced as “gets.” For example, the statement:

    x <- 1

is usually read as “x gets 1.” (If you’ve ever done any work with theoretical computer science, you’ll probably like this notation: it looks just like algorithm pseudocode.)

After you assign a value to a variable, the R interpreter will substitute that value in place of the variable name when it evaluates an expression. Here’s a simple example:

    > x <- 1
    > y <- 2
    > z <- c(x,y)
    > # evaluate z to see what's stored as z
    > z
    [1] 1 2

Notice that the substitution is done at the time that the value is assigned to z, not at the time that z is evaluated. Suppose that you were to type in the preceding three expressions and then change the value of y. The value of z would not change:

    > y <- 4
    > z
    [1] 1 2

I’ll talk more about the subtleties of variables and how they’re evaluated in Chapter 8.

R provides several different ways to refer to a member (or set of members) of a vector. You can refer to elements by location in a vector:

    > b <- c(1,2,3,4,5,6,7,8,9,10,11,12)
    > b
     [1]  1  2  3  4  5  6  7  8  9 10 11 12
    > # let's fetch the 7th item in vector b
    > b[7]
    [1] 7
    > # fetch items 1 through 6
    > b[1:6]
    [1] 1 2 3 4 5 6
    > # fetch only members of b that are congruent to zero (mod 3)
    > # (in non-math speak, members that are multiples of 3)
    > b[b %% 3 == 0]
    [1]  3  6  9 12

You can fetch multiple items in a vector by specifying the indices of each item as an integer vector:

    > # fetch items 1 through 6
    > b[1:6]
    [1] 1 2 3 4 5 6
    > # fetch 1, 6, 11
    > b[c(1,6,11)]
    [1]  1  6 11

You can fetch items out of order. Items are returned in the order they are referenced:

    > b[c(8,4,9)]
    [1] 8 4 9

You can also specify which items to fetch through a logical vector.
As an example, let’s fetch only multiples of 3 (by selecting items that are congruent to 0 mod 3):

    > b %% 3 == 0
     [1] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
    [12]  TRUE
    > b[b %% 3 == 0]
    [1]  3  6  9 12

In R, there are two additional operators that can be used for assigning values to symbols. First, you can use a single equals sign (“=”) for assignment.2 This operator assigns the symbol on the left to the object on the right. In many other languages, all assignment statements use equals signs. If you are more comfortable with this notation, you are free to use it. However, I will be using only the <- assignment operator in this book because I think it is easier to read. Whichever notation you prefer, be careful because the = operator does not mean “equals.” For that, you need to use the == operator:

    > one <- 1
    > two <- 2
    > # This means: assign the value of "two" to the variable "one"
    > one = two
    > one
    [1] 2
    > two
    [1] 2
    > # let's start again
    > one <- 1
    > two <- 2
    > # This means: does the value of "one" equal the value of "two"
    > one == two
    [1] FALSE

2. Note that you cannot use the <- operator when passing arguments to a function; you need to map values to argument names using the “=” symbol. Using the <- operator in a function will assign the value to the variable in the current environment and then pass the value returned to the function. This might be what you want, but it probably isn’t.

In R, you can also assign an object on the left to a symbol on the right:

    > 3 -> three
    > three
    [1] 3

In some programming contexts, this notation might help you write clearer code. (It may also be convenient if you type in a long expression and then realize that you have forgotten to assign the result to a symbol.)

A function in R is just another object that is assigned to a symbol.
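As a small sketch of what it means for a function to be an ordinary object, the lines below use only built-in functions. The symbol name total is a placeholder chosen for this illustration and does not come from the text.

```r
# A built-in function has a class, like any other object.
class(sum)                        # "function"

# Bind the same function object to a second symbol and call it.
total <- sum
total(1, 2, 3)                    # 6

# Both symbols now refer to the same object.
identical(total, sum)             # TRUE

# Because functions are objects, they can be passed as arguments.
roots <- sapply(c(1, 4, 9), sqrt)
roots                             # 1 2 3
```

Nothing here copies or redefines sum; the assignment just gives the existing function object another name.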
You can define your own functions in R, assign them a name, and then call them just like the built-in functions:

    > f <- function(x,y) {c(x+1, y+1)}
    > f(1,2)
    [1] 2 3

This leads to a very useful trick. You can often type the name of a function to see the code for it. Here’s an example:

    > f
    function(x,y) {c(x+1, y+1)}

Introduction to Data Structures

In R, you can construct more complicated data structures than just vectors. An array is a multidimensional vector. Vectors and arrays are stored the same way internally, but an array may be displayed differently and accessed differently. An array object is just a vector that’s associated with a dimension attribute. Here’s a simple example.

First, let’s define an array explicitly:

    > a <- array(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), dim=c(3, 4))

Here is what the array looks like:

    > a
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12

And here is how you reference one cell:

    > a[2,2]
    [1] 5

Now, let’s define a vector with the same contents:

    > v <- c(1,2,3,4,5,6,7,8,9,10,11,12)
    > v
     [1]  1  2  3  4  5  6  7  8  9 10 11 12

A matrix is just a two-dimensional array:

    > m <- matrix(data=c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4)
    > m
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12

Arrays can have more than two dimensions. For example:

    > w <- array(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18),dim=c(3,3,2))
    > w
    , , 1

         [,1] [,2] [,3]
    [1,]    1    4    7
    [2,]    2    5    8
    [3,]    3    6    9

    , , 2

         [,1] [,2] [,3]
    [1,]   10   13   16
    [2,]   11   14   17
    [3,]   12   15   18

    > w[1,1,1]
    [1] 1

R uses very clean syntax for referring to part of an array.
You specify separate indices for each dimension, separated by commas:

    > a[1,2]
    [1] 4
    > a[1:2,1:2]
         [,1] [,2]
    [1,]    1    4
    [2,]    2    5

To get all rows (or columns) from a dimension, simply omit the indices:

    > # first row only
    > a[1,]
    [1]  1  4  7 10
    > # first column only
    > a[,1]
    [1] 1 2 3
    > # you can also refer to a range of rows
    > a[1:2,]
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    > # you can even refer to a noncontiguous set of rows
    > a[c(1,3),]
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    3    6    9   12

In all the examples above, we’ve just looked at data structures based on a single underlying data type. In R, it’s possible to construct more complicated structures with multiple data types. R has a built-in data type for mixing objects of different types, called lists. Lists in R are subtly different from lists in many other languages. Lists in R may contain a heterogeneous selection of objects. You can name each component in a list. Items in a list may be referred to by either location or name.

Here is an example of a list with two named components:

    > # a list containing two strings
    > e <- list(thing="hat", size="")
    > e
    $thing
    [1] "hat"

    $size
    [1] ""

You may access an item in the list in multiple ways:

    > e$thing
    [1] "hat"
    > e[1]
    $thing
    [1] "hat"

    > e[[1]]
    [1] "hat"

A list can even contain other lists:

    > g <- list("this list references another list", e)
    > g
    [[1]]
    [1] "this list references another list"

    [[2]]
    [[2]]$thing
    [1] "hat"

    [[2]]$size
    [1] ""

A data frame is a list that contains multiple named vectors that are the same length. A data frame is a lot like a spreadsheet or a database table. Data frames are particularly good for representing experimental data. As an example, I’m going to use some baseball data.
Let’s construct a data frame with the win/loss results in the National League (NL) East:

    > teams <- c("PHI","NYM","FLA","ATL","WSN")
    > w <- c(92, 89, 94, 72, 59)
    > l <- c(70, 73, 77, 90, 102)
    > nleast <- data.frame(teams,w,l)
    > nleast
      teams  w   l
    1   PHI 92  70
    2   NYM 89  73
    3   FLA 94  77
    4   ATL 72  90
    5   WSN 59 102

You can refer to the components of a data frame (or items in a list) by name using the $ operator:

    > nleast$w
    [1] 92 89 94 72 59

Here’s one way to find a specific value in a data frame. Suppose that you wanted to find the number of losses by the Florida Marlins (FLA). One way to select a member of a vector is to use a vector of Boolean values that specifies which items to return. You can calculate an appropriate vector like this:

    > nleast$teams=="FLA"
    [1] FALSE FALSE  TRUE FALSE FALSE

Then you can use this vector to refer to the right element in the losses vector:

    > nleast$l[nleast$teams=="FLA"]
    [1] 77

You can import data into R from another file or from a database. See Chapter 11 for more information on how to do this.

In addition to lists, R has other types of data structures for holding a heterogeneous collection of objects, including formal class definitions through S4 objects.

Objects and Classes

R is an object-oriented language. Every object in R has a type. Additionally, every object in R is a member of a class. We have already encountered several different classes: character vectors, numeric vectors, data frames, lists, and arrays.

You can use the class function to determine the class of an object. For example:

    > class(teams)
    [1] "character"
    > class(w)
    [1] "numeric"
    > class(nleast)
    [1] "data.frame"
    > class(class)
    [1] "function"

Notice the last example: a function is an object in R with the class function.

Some functions are associated with a specific class. These are called methods.
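To make the idea of a method concrete, here is a minimal sketch using R’s S3 system. The class name temperature, its fields, and the object temp are hypothetical names invented for this illustration; they do not appear elsewhere in the text.

```r
# Build an object and tag it with a (hypothetical) class name.
temp <- structure(list(value = 21.5, unit = "C"), class = "temperature")

# Define a print method for that class. The name print.temperature ties
# this function to objects whose class is "temperature".
print.temperature <- function(x, ...) {
  cat("Temperature:", x$value, x$unit, "\n")
  invisible(x)
}

# Calling print on the object runs the method defined above.
print(temp)

# Objects of other classes are unaffected and print as usual.
print(21.5)
```

The only link between the function and the class is the naming pattern, which reflects how informal this kind of class system is.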
(Not all functions are tied closely to a particular class; the class system in R is much less formal than that in a language like Java.)

In R, methods for different classes can share the same name. These are called generic functions. Generic functions serve two purposes. First, they make it easy to guess the right function name for an unfamiliar class. Second, generic functions make it possible to use the same code for objects of different types.

For example, + is a generic function for adding objects. You can add numbers together with the + operator:

    > 17 + 6
    [1] 23

You might guess that the addition operator would work similarly with other types of objects. For example, you can also use the + operator with a date object and a number:

    > as.Date("") + 7
    [1] ""

By the way, the R interpreter calls the generic function print on any object returned on the R console. Suppose that you define x as:

    > x <- 1 + 2 + 3 + 4

When you type:

    > x
    [1] 10

the interpreter actually calls the function print(x) to print the results. This means that if you define a new class, you can define a print method to specify how objects from that new class are printed on the console. Some functions take advantage of this functionality to do other things when you enter an expression on the console.3

I’ll talk about objects in more depth in Chapter 7 and classes in a later chapter.

Models and Formulas

To statisticians, a model is a concise way to describe a set of data, usually with a mathematical formula. Sometimes, the goal is to build a predictive model with training data to predict values based on other data. Other times, the goal is to build a descriptive model that helps you understand the data better.

R has a special notation for describing relationships between variables. Suppose that you are assuming a linear model for a variable y, predicted from the variables x1, x2, ..., xn.
(Statisticians usually refer to y as the dependent variable, and x1, x2, ..., xn as the independent variables.) In equation form, this implies a relationship like:

y = c0 + c1 x1 + c2 x2 + ... + cn xn + ε

In R, you would write the relationship as y ~ x1 + x2 + ... + xn, which is a formula object.

[3] A very important example of this is lattice graphics. Plotting functions in the lattice library return lattice objects but don’t plot results. If you call a lattice function on the R console, the console will print the object, thus plotting the results. However, if you call a lattice function within another function, or in a script, R will not plot the results unless you explicitly print the lattice object.

As an example, let’s use the cars data set (which is included in the datasets package). This data set was created during the 1920s and shows the speed and stopping distance for a set of different cars. We’ll look at the relationship between speed and stopping distance. We’ll assume that the stopping distance is a linear function of speed.

So let’s try to use a linear regression to estimate the relationship. The formula is dist~speed. We’ll use the lm function to estimate the parameters of a linear model. The lm function returns an object of class lm, which we will assign to a variable called cars.lm:

> cars.lm <- lm(formula=dist~speed,data=cars)

Now, let’s take a quick look at the results returned:

> cars.lm

Call:
lm(formula = dist ~ speed, data = cars)

Coefficients:
(Intercept)        speed

As you can see, printing an lm object shows you the original function call (and thus the data set and formula) and the estimated coefficients. For some more information, we can use the summary function:

> summary(cars.lm)

Call:
lm(formula = dist ~ speed, data = cars)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                                      *
speed                                   e        ***
Signif. codes:  0 '***' '**' '*' '.' ' ' 1

Residual standard error: on 48 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 1 and 48 DF, p-value: e

As you can see, the summary function shows you the function call, the distribution of the residuals from the fit, the coefficients, and information about the fit.

By the way, it is possible to simply call the lm function or to call summary(lm()) and not assign a name to the model object:

> lm(dist~speed,data=cars)

Call:
lm(formula = dist ~ speed, data = cars)

Coefficients:
(Intercept)        speed

> summary(lm(dist~speed,data=cars))

Call:
lm(formula = dist ~ speed, data = cars)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                                      *
speed                                   e        ***
Signif. codes:  0 '***' '**' '*' '.' ' ' 1

Residual standard error: on 48 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 1 and 48 DF, p-value: e

In some cases, this can be more convenient. However, you often want to perform additional analyses, such as plotting residuals, calculating additional statistics, or updating a model to add or subtract variables. By assigning a name to the model, you can make your code easier to understand and modify. Additionally, refitting a model can be very time consuming for complex models and large data sets. By assigning the model to a variable name, you can avoid these problems.

Charts and Graphics

R includes several packages for visualizing data: graphics, grid, and lattice. Usually, you’ll find that functions within the graphics and lattice packages are the most useful.[4] If you’re familiar with Microsoft Excel, you’ll find that R can generate all of the charts that you’re familiar with: column charts, bar charts, line plots, pie charts, and scatter plots. Even if that’s all you need, R makes it much easier than Excel to automate the creation of charts and to customize them.
However, there are many, many more types of charts available in R, many of them quite intuitive and elegant.

[4] Other packages are available for visualizing data. For example, the rggobi package provides tools for creating interactive graphics.

To make this a little more interesting, let’s work with some real data. We’re going to look at all field goal attempts in a single National Football League (NFL) season.[5] For those of you who aren’t familiar with American football, here’s a quick explanation. A team can attempt to kick a football between a set of goalposts to receive 3 points. If it misses the field goal, possession of the ball reverts to the other team (at the spot on the field where the kick was attempted).

[5] The data was provided by Aaron Schatz of Pro Football Prospectus. For more information, see the Football Outsiders website, or you can find its annual books at most bookstores, both online and “brick and mortar.”

First, let’s take a quick look at the distribution of distances. R provides a function, hist, that can do this quickly for us. Let’s start by loading the appropriate data set. (The data set is included in the nutshell package; see the Preface for information on how to obtain this package.)

> library(nutshell)
> data(field.goals)

Let’s take a quick look at the names of the columns in the field.goals data frame:

> names(field.goals)
 [1] "home.team"    "week"         "qtr"          "away.team"
 [5] "offense"      "defense"      "play.type"    "player"
 [9] "yards"        "stadium.type"

Now, let’s just try the hist command:

> hist(field.goals$yards)

This produces a chart like the one shown in the accompanying figure. (Depending on your system, if you try this yourself, you may see a differently colored and formatted chart. I tweaked a few graphical parameters so the charts would look good in print.)
I wanted to see more detail about the number of field goals at different distances, so I modified the breaks argument to add more bins to the histogram:

> hist(field.goals$yards, breaks=35)

Figure: Histogram of field goal attempts with default settings

You can see the results of this command in the second histogram figure.

R also features many other ways to visualize data. A great example is a strip chart. This chart just plots one point on the x-axis for every point in a vector. As an example, let’s look at the distance of blocked field goals. We can distinguish blocked field goals with the play.type variable in the field.goals data frame. Let’s take a quick look at how many blocked field goals there were. We’ll use the table function to tabulate the results:

> table(field.goals$play.type)

FG aborted FG blocked    FG good      FG no
         8         24

Figure: Histogram of field goal distances, showing more bins

Now we’ll select only observations with blocked field goals. We’ll add a little jitter so we can see individual points. Finally, we will also change the appearance of the points using the pch argument:

> stripchart(field.goals[field.goals$play.type=="FG blocked",]$yards,
+ pch=19, method="jitter")

The results are shown in the strip chart figure.

As a second example, let’s use the cars data set, which is included in the datasets package. The cars data set consists of a set of 50 observations:

> data(cars)
> dim(cars)
[1] 50  2
> names(cars)
[1] "speed" "dist"

Each observation contains the speed of the car and the distance required to stop. Let’s take a quick look at the contents of this data set:

> summary(cars)
     speed          dist
 Min.   :       Min.   :
 1st Qu.:       1st Qu.:
 Median :       Median :
 Mean   :       Mean   :
 3rd Qu.:       3rd Qu.:
 Max.   :       Max.   :
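The histogram and strip chart calls above depend on the nutshell package. As a rough stand-in that runs on a plain R installation, the sketch below applies the same functions to the built-in cars data. The variable names h_default and h_fine, and the choice of cars with speed above 20, are assumptions made only for this illustration.

```r
# cars ships with R in the datasets package, so no extra packages are needed.
data(cars)

# plot = FALSE returns the histogram object without drawing it, which makes
# the effect of the breaks argument easy to inspect.
h_default <- hist(cars$dist, plot = FALSE)
h_fine <- hist(cars$dist, breaks = 35, plot = FALSE)

# breaks is treated as a suggestion, so compare the actual bin counts.
print(length(h_default$counts))
print(length(h_fine$counts))

# Draw the finer histogram, then a jittered strip chart of a subset.
hist(cars$dist, breaks = 35)
stripchart(cars$dist[cars$speed > 20], pch = 19, method = "jitter")
```

Because hist treats a single number passed as breaks only as a hint, the number of bins you get may differ from the number you ask for.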
Let’s plot the relationship between vehicle speed and stopping distance:

> plot(cars, xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
+ las = 1, xlim = c(0, 25))

The plot is shown in the accompanying figure. At a quick glance, we see that stopping distance is roughly proportional to speed.

Figure: Plot of data in the cars data set

Figure: Strip chart showing field goal attempt distances

Let’s try one more example, this time using lattice graphics. Lattice graphics provide some great tools for drawing pretty charts, particularly charts that compare different groups of points. By default, the lattice package is not loaded; you will get an error if you try calling a lattice function without loading the library. To load the library, use the following command:

> library(lattice)

We will talk more about R packages in Chapter 4.

For example data, we’ll look at how American eating habits changed over time.[6] The consumption data set is available in the nutshell package. It contains 48 observations, each showing the amount of a commodity consumed (or produced) in a specific year. Data is available only for years that are multiples of 5 (so there are six unique years). The amount of food consumed is given by Amount, the type of food is given by Food, and the year is given by Year.

Two of the variables are numeric vectors: Amount and Year. However, two of them are an important data type that we haven’t seen yet: factors. A factor is an R object type that is used to compactly represent a vector of categorical values. Factors are used in many modeling functions. You can create a factor from another vector (typically a character vector) using the factor function. In this data frame, the values Food and Units are factors. (We’ll discuss vectors in more detail in “Vectors.”)

To help reveal trends in the data, I decided to use the dotplot function. (This function resembles line charts in Excel.)
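Since factors are new at this point, a short sketch of how the factor function behaves may help. The food names below are hypothetical sample values, not rows from the consumption data set.

```r
# Hypothetical categorical values stored as a character vector.
food <- c("Beef", "Pork", "Beef", "Chicken", "Pork", "Beef")

# factor() records the distinct values once (the levels) and stores
# each element as an integer code that points at a level.
food_factor <- factor(food)

print(levels(food_factor))      # the distinct categories
print(as.integer(food_factor))  # the integer code behind each element
print(table(food_factor))       # counts per category
```

Storing each distinct label once and referring to it by integer code is what makes a factor a compact representation of categorical data.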
Specifically, we’d like to look at how the Amount varies by Year. We’d like to separately plot the trend for each value of the Food variable. For lattice graphics, we specify the data that we want to plot through a formula, in this case, Amount ~ Year | Food. A formula is an R object that is used to express a relationship between a set of variables.

If you’d like, you can try plotting the relationship using the default settings:

> library(nutshell)
> library(lattice)
> data(consumption)
> dotplot(Amount~Year|Food, consumption)

I found the default plot hard to read: the axis labels were too big, the scale for each plot was the same, and the stacking didn’t look right to me. So I tuned the presentation a little bit. Here is the version that produced the figure:

> dotplot(Amount ~ Year | Food,data=consumption,
+ aspect="xy",scales=list(relation="sliced", cex=.4))

[6] I obtained the data from the Statistical Abstract of the United States, a terrific book of data about the United States that is published by the Census Bureau. I took a subset of the data, keeping consumption for only the largest categories. You can find this data on the Census Bureau website.

Figure: Lattice plot showing changes in American eating habits

The aspect option changes the aspect ratios of each plot to try to show changes from 45° angles (making changes easier to see). The scales option changes how the axes are drawn. I’ll discuss lattice plots in more detail in Chapter 14, explaining how to use different options to tune the look of your charts.

Getting Help

R includes a help system to help you get information about installed packages.
To get help on a function, for example glm, you would type:

> help(glm)

or, equivalently:

> ?glm

To search for help on an operator, you need to place the operator in backquotes:

> ?`+`

If you’d like to try the examples in a help file, you can use the example function to automatically try them. For example, to see the example for glm, type:

> example(glm)

You can search for help on a topic, for example “regression,” using the help.search function:

> help.search("regression")

This can be very helpful if you can’t remember the name of a function; R will return a list of relevant topics. There is a shorthand for this command as well:

> ??regression

To get the help file for a package, you can sometimes use one of the commands above. However, you can also use the help option for the library command to get more complete information. For example, to get help on the grDevices library, you would use the following function:

> library(help="grDevices")

Some packages (especially packages from Bioconductor) include at least one vignette. A vignette is a short document that describes how to use the package, complete with examples. You can view a vignette using the vignette command. For example, to view the vignette for the affy package (assuming that you have installed this package), you would use the following command:

> vignette("affy")

To view available vignettes for all attached packages, you can use the following command:

> vignette(all=FALSE)

To view vignettes for all installed packages, try this command:

> vignette(all=TRUE)

Chapter 4. R Packages

A package is a related set of functions, help files, and data files that have been bundled together. Packages in R are similar to modules in Perl, libraries in C/C++, and classes in Java.

Typically, all the functions in the package are related: for example, the stats package contains functions for doing statistical analysis.
To use a package, you need to load it into R (see “Loading Packages” for directions on loading packages).

R offers an enormous number of packages: packages that display graphics, packages for performing statistical tests, and packages for trying the latest machine learning techniques. There are also packages designed for a wide variety of industries and applications: packages for analyzing microarray data, packages for modeling credit risks, and packages for social sciences.

Some of these packages are included with R: you just have to tell R that you want to use them. Other packages are available from public package repositories. You can even make your own packages. This chapter explains how to use packages.

An Overview of Packages

To use a package in R, you first need to make sure that it has been installed into a local library.[1] By default, packages are read from one system-level library, but you can add additional libraries.

[1] If you’re a C/C++ programmer, don’t get confused; “library” means something different in R.

Next, you need to load the packages into your current session. You might be wondering why you need to load packages into R in order to use them. First, R’s help system slows down significantly when you add more packages to search. (I know this from personal experience: I loaded dozens of packages into R while writing this book, and the help system slowed to a crawl.) Second, it’s possible that two packages have objects with the same name. If every package were loaded into R by default, you might think you were using one function but really be using another. Even worse, it’s possible for there to be internal conflicts: two packages may both use functions with names like “fit” that work very differently, resulting in strange and unexpected results. By loading only packages that you need, you can minimize the chance of these conflicts.
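The name-conflict problem described above can be seen on a small scale without installing anything. The sketch below is only an illustration: it masks the stats function sd with a user-defined function of the same name (a hypothetical stand-in for a conflicting package) and then uses the pkg::name notation to reach the original.

```r
# search() lists the attached packages and environments in lookup order.
print(search())

x <- c(1, 2, 3)

# pkg::name selects a function from one specific package.
print(stats::sd(x))

# A definition with the same name in the workspace masks the stats version,
# much as a second package exporting "sd" could.
sd <- function(v) "not the real sd"
print(sd(x))          # calls the masking definition
print(stats::sd(x))   # still reaches the stats version

# Remove the masking definition so that sd() means stats::sd again.
rm(sd)
```

Writing stats::sd is more verbose, but it leaves no doubt about which definition is being called.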
Listing Packages in Local Libraries

To get the list of packages loaded by default, you can use the getOption command to check the value of the defaultPackages value:

> getOption("defaultPackages")
[1] "datasets"  "utils"     "grDevices" "graphics"  "stats"
[6] "methods"

This command omits the base package; the base package implements many key features of the R language and is always loaded.

If you would like to see the list of currently loaded packages, you can use the .packages command (note the parentheses around the outside):

> (.packages())
[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
[7] "base"

To show all packages available, you can use the all.available option with the .packages command:

> (.packages(all.available=TRUE))
 [1] "KernSmooth" "MASS"       "base"       "bitops"     "boot"
 [6] "class"      "cluster"    "codetools"  "datasets"   "foreign"
[11] "grDevices"  "graphics"   "grid"       "hexbin"     "lattice"
[16] "maps"       "methods"    "mgcv"       "nlme"       "nnet"
[21] "rpart"      "spatial"    "splines"    "stats"      "stats4"
[26] "survival"   "tcltk"      "tools"      "utils"

You can also enter the library() command with no arguments, and a new window will pop up showing you the set of available packages.

Included Packages

R comes with a number of different packages (see the table that follows for a list). Some of these packages (like base, graphics, grDevices, methods, and utils) implement basic features of the R language or R environment. Other packages provide commonly used statistical modeling tools (like cluster, nnet, and stats). Other packages implement sophisticated graphics (grid and lattice), contain examples (datasets), or contain other frequently used functions. In many cases, you won’t need to get any other packages.
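The listing commands above can also be used programmatically, for example to check whether a package is attached or installed before relying on it. This is a minimal sketch; the variable names attached and installed are introduced here for illustration, and the exact package lists will differ from system to system.

```r
# Packages attached in the current session. The value is returned
# invisibly, which is why the text wraps the call in parentheses.
attached <- .packages()

# Packages installed in any of the libraries that R searches.
installed <- .packages(all.available = TRUE)

# The library directories themselves.
print(.libPaths())

# Membership tests against the two lists.
print("stats" %in% attached)
print("lattice" %in% installed)  # lattice normally ships with R
```

A package can appear in installed without appearing in attached; that gap is exactly what the library command closes.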
Table: Packages included with R

Package name   Loaded by default   Description
rpart                              Tools for building recursive partitioning and regression tree models
spatial                            Functions for Kriging and point pattern analysis
splines                            Regression spline functions and classes
stats          ✓                   Functions for statistics calculations and random number generation; includes many common statistical tests, probability distributions, and modeling tools
stats4                             Statistics functions as S4 methods and classes
survival                           Survival analysis functions
tcltk                              Interface to Tcl/Tk; used to create platform-independent UI tools
tools                              Tools for developing packages
utils          ✓                   A variety of utility functions for R, including package management, file reading and writing, and editing

Loading Packages

By default, not all packages are loaded into R. If you try to use a function from a package that hasn’t been loaded, you’ll get an error:

> # try to use rpart before loading it
> fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
Error: could not find function "rpart"

To load a package in R, you can use the library() command. For example, to load the package rpart (which contains functions for building recursive partition trees), you would use the following command:

> library(rpart)

(There is a similar command, require(), that takes slightly different arguments. For more about require, see the R help files.)

If you’re more comfortable using a GUI, you can browse for packages and load them using the GUI. If you choose to use this interface to find packages, make sure that you include the appropriate library command with your scripts to prevent errors later.

Loading Packages on Windows and Linux

On Microsoft Windows, you can use the library function to load packages. Alternatively, you can select “Load package” from the Packages menu in the GUI. This will bring up a window showing a list of packages that you can choose to load.
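To make the difference between library() and require() a little more concrete, here is a small sketch. The package name notARealPackage123 is hypothetical and assumed not to be installed; rpart is a recommended package that normally ships with R, but that too is an assumption about your installation.

```r
pkg <- "notARealPackage123"  # hypothetical name, assumed not to be installed

# require() returns FALSE (and warns) when the package cannot be loaded.
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
print(ok)

# library() signals an error instead, which tryCatch can intercept.
result <- tryCatch({
  library(pkg, character.only = TRUE)
  "loaded"
}, error = function(e) "not available")
print(result)

# Once a package is loaded, its functions and data sets can be found.
library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
print(class(fit))
```

In scripts, library() is often the safer default because a missing package stops execution immediately rather than surfacing later as a "could not find function" error.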
Loading Packages on Mac OS X

The Mac OS X R environment is a little fancier than the other versions. Like the other versions, you can use the library() function. Otherwise, you can select Package Manager from the “Packages & Data” menu. The Package Manager UI, as shown in the accompanying figure, lets you see which packages are loaded, load packages, and even browse the help file for a package.

Figure: Mac OS X Package Manager

Exploring Package Repositories

You can find thousands of R packages online. The two biggest sources of packages are CRAN (Comprehensive R Archive Network) and Bioconductor, but some packages are available elsewhere. (If you know Perl, you’ll notice that CRAN is very similar to CPAN, the Comprehensive Perl Archive Network.)

CRAN is hosted by the R Foundation (the same nonprofit organization that oversees R development). The archive contains a very large number of packages, covering a wide range of applications. CRAN is hosted on a set of mirror sites around the world. Try to pick an archive site near you: you’ll minimize download times and help reduce the server load on the R Foundation.

Bioconductor is an open-source project for building tools to analyze genomic data. Bioconductor tools are built using R and are distributed as R packages. The Bioconductor packages are distributed separately from R, and most are not available on CRAN. There are dozens of different packages available directly through the Bioconductor project.

R-Forge is another interesting place to look for packages. The R-Forge site contains projects that are in progress, and it provides tools for developers to collaborate. You may find some interesting packages on this site, but please be sure to read the disclaimers and documentation, because many of these packages are works in progress.
R includes the ability to download and install packages from other repositories. However, I don’t know of other public repositories for R packages. Most R projects simply use CRAN to host their packages. (I’ve even seen some books that use CRAN to distribute sample code and sample data.)

Exploring R Package Repositories on the Web

R provides good tools for installing packages within the GUI but doesn’t provide a good way to find a specific package. Luckily, it’s pretty easy to find a package on the Web. You can browse through the set of available packages with your web browser. Here are some places to look for packages.

Repository     URL
CRAN           See https://cran.r-project.org for an authoritative list, but you should try to find your local mirror and use that site instead
Bioconductor   https://www.bioconductor.org
R-Forge        https://r-forge.r-project.org

However, you can also try to find packages with a search engine. I’ve had good luck finding packages by using Google to search for “R package” plus the name of the application. For example, searching for “R package multivariate additive regression splines” can help you find the mda package, which contains the mars function. (Of course, I discovered later that the earth package is a better choice for this algorithm, but we’ll get to that later.)

Finding and Installing Packages Inside R

Once you figure out what package you want to install, the easiest way to do it is inside R.

Windows and Linux GUIs

Installing packages through the Windows GUI is pretty straightforward.

1. (Optional) By default, R is set to fetch packages from the “CRAN” and “CRAN (extra)” categories. To pick additional sets of packages, choose “Select repositories” from the Packages menu. You can choose multiple repositories.
2. From the Packages menu, select “Install package(s)”.
3. If this is the first time you are installing a package during this session, R will ask you to pick a mirror. (You’ll probably want to pick a site that is geographically close, because it’s likely to also be close on the Internet, and thus fast.)
4. Click the name of the package that you want to install and press OK.

R will download and install the packages that you have selected.

Note that you may run into issues installing packages, depending on the permissions assigned to your user account. If you are using Windows XP, and your account is a member of the Administrators group, you should have no problems. If you are using Windows Vista, and you installed R in your own directory, you should have no issues. Otherwise, you may need to run R as an Administrator in order to install supplementary packages.

Mac OS X GUI

On Mac OS X, there is a slightly different user interface for package installation. It shows a little more information than the Windows version, but it’s a little more confusing to use.

1. From the “Packages & Data” menu, select Package Installer. (See the accompanying figure for a picture of the installer window.)
2. (Optional) In the top-left corner of the window is a menu that allows you to select the category of packages you would like to download. Initially, this is set to “CRAN (binaries).”
3. Click the Get List button to display the available set of packages.
4. You can use the search box to filter the list to show only packages that match the name you are looking for. (Note: you have to click the Get List button before the search will return results.)
5. Select the set of packages that you want to install and press the Install Selected button.

By default, R will install packages at the system level, making them available to all users. If you do not have the appropriate permissions to install packages globally, or if you would like to install them elsewhere, then select an alternative location. Additionally, R will not install the additional packages on which your packages depend.
You will get an error if you try to load a package and have not installed other packages on which it is dependent.

R console

You can also install R packages directly from the R console. The accompanying table shows the set of commands for installing packages from the console.

As a simple example, suppose that you wanted to install the packages tree and maptree. You could accomplish this with the following command:

> install.packages(c("tree","maptree"))
trying URL 'www.oxygen.com.ro /tree_tgz'
Content type 'application/x-gzip' length bytes ( Kb)
opened URL
==================================================
downloaded Kb

trying URL 'www.oxygen.com.ro /maptree_tgz'
Content type 'application/x-gzip' length bytes (99 Kb)
opened URL
==================================================
downloaded 99 Kb

The downloaded packages are in
        /var/folders/gj/gj60srEiEVq4hTWB5lvMak+++TM/-Tmp-//RtmpIXUWDu/downloaded_packages

This will install the packages to the default library specified by the variable .Library. If you’d like to remove these packages after you’re done, you can use remove.packages. You need to specify the library where the packages were installed:

>