=head1 NAME

perlreftut - Mark's very short tutorial about references

=head1 DESCRIPTION

One of the most important new features in Perl 5 was the capability to
manage complicated data structures like multidimensional arrays and
nested hashes. To enable these, Perl 5 introduced a feature called
`references', and using references is the key to managing complicated,
structured data in Perl. Unfortunately, there's a lot of funny syntax
to learn, and the main manual page can be hard to follow. The manual
is quite complete, and sometimes people find that a problem, because
it can be hard to tell what is important and what isn't.

Fortunately, you only need to know 10% of what's in the main page to get
90% of the benefit. This page will show you that 10%.

=head1 Who Needs Complicated Data Structures?

One problem that came up all the time in Perl 4 was how to represent a
hash whose values were lists. Perl 4 had hashes, of course, but the
values had to be scalars; they couldn't be lists.

Why would you want a hash of lists? Let's take a simple example: You
have a file of city and country names, like this:

    Chicago, USA
    Frankfurt, Germany
    Berlin, Germany
    Washington, USA
    Helsinki, Finland
    New York, USA

and you want to produce an output like this, with each country mentioned
once, and then an alphabetical list of the cities in that country:

    Finland: Helsinki.
    Germany: Berlin, Frankfurt.
    USA: Chicago, New York, Washington.

The natural way to do this is to have a hash whose keys are country
names. Associated with each country name key is a list of the cities in
that country. Each time you read a line of input, split it into a country
and a city, look up the list of cities already known to be in that
country, and append the new city to the list.
When you're done reading
the input, iterate over the hash as usual, sorting each list of cities
before you print it out.

If hash values can't be lists, you lose. In Perl 4, hash values can't
be lists; they can only be strings. You lose. You'd probably have to
combine all the cities into a single string somehow, and then when
time came to write the output, you'd have to break the string into a
list, sort the list, and turn it back into a string. This is messy
and error-prone. And it's frustrating, because Perl already has
perfectly good lists that would solve the problem if only you could
use them.

=head1 The Solution

By the time Perl 5 rolled around, we were already stuck with this
design: Hash values must be scalars. The solution to this is
references.

A reference is a scalar value that I<refers to> an entire array or an
entire hash (or to just about anything else). Names are one kind of
reference that you're already familiar with. Think of the President
of the United States: a messy, inconvenient bag of blood and bones.
But to talk about him, or to represent him in a computer program, all
you need is the easy, convenient scalar string "George Bush".

References in Perl are like names for arrays and hashes. They're
Perl's private, internal names, so you can be sure they're
unambiguous. Unlike "George Bush", a reference only refers to one
thing, and you always know what it refers to. If you have a reference
to an array, you can recover the entire array from it. If you have a
reference to a hash, you can recover the entire hash. But the
reference is still an easy, compact scalar value.

You can't have a hash whose values are arrays; hash values can only be
scalars. We're stuck with that.
But a single reference can refer to
an entire array, and references are scalars, so you can have a hash of
references to arrays, and it'll act a lot like a hash of arrays, and
it'll be just as useful as a hash of arrays.

We'll come back to this city-country problem later, after we've seen
some syntax for managing references.

=head1 Syntax

There are just two ways to make a reference, and just two ways to use
it once you have it.

=head2 Making References

=head3 B<Make Rule 1>

If you put a C<\> in front of a variable, you get a
reference to that variable.

    $aref = \@array;         # $aref now holds a reference to @array
    $href = \%hash;          # $href now holds a reference to %hash
    $sref = \$scalar;        # $sref now holds a reference to $scalar

Once the reference is stored in a variable like $aref or $href, you
can copy it or store it just the same as any other scalar value:

    $xy = $aref;             # $xy now holds a reference to @array
    $p[3] = $href;           # $p[3] now holds a reference to %hash
    $z = $p[3];              # $z now holds a reference to %hash

These examples show how to make references to variables with names.
Sometimes you want to make an array or a hash that doesn't have a
name. This is analogous to the way you like to be able to use the
string C<"\n"> or the number 80 without having to store it in a named
variable first.

=head3 B<Make Rule 2>

C<[ ITEMS ]> makes a new, anonymous array, and returns a reference to
that array. C<{ ITEMS }> makes a new, anonymous hash, and returns a
reference to that hash.

    $aref = [ 1, "foo", undef, 13 ];
    # $aref now holds a reference to an array

    $href = { APR => 4, AUG => 8 };
    # $href now holds a reference to a hash

The references you get from rule 2 are the same kind of
references that you get from rule 1:

    # This:
    $aref = [ 1, 2, 3 ];

    # Does the same as this:
    @array = (1, 2, 3);
    $aref = \@array;

The first line is an abbreviation for the following two lines, except
that it doesn't create the superfluous array variable C<@array>.

If you write just C<[]>, you get a new, empty anonymous array.
If you write just C<{}>, you get a new, empty anonymous hash.

=head2 Using References

What can you do with a reference once you have it? It's a scalar
value, and we've seen that you can store it as a scalar and get it back
again just like any scalar. There are just two more ways to use it:

=head3 B<Use Rule 1>

You can always use an array reference, in curly braces, in place of
the name of an array. For example, C<@{$aref}> instead of C<@array>.

Here are some examples of that:

Arrays:

    @a              @{$aref}            An array
    reverse @a      reverse @{$aref}    Reverse the array
    $a[3]           ${$aref}[3]         An element of the array
    $a[3] = 17;     ${$aref}[3] = 17    Assigning an element

On each line are two expressions that do the same thing. The
left-hand versions operate on the array C<@a>. The right-hand
versions operate on the array that is referred to by C<$aref>. Once
they find the array they're operating on, both versions do the same
things to the arrays.
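
The array equivalences above can be tried out directly. The following is
only a minimal sketch: the sample values and the helper variable names
(C<@by_name>, C<@by_ref>, and so on) are made up for illustration, and
C<$aref> is assumed to have been made from C<@a> with B<Make Rule 1>.

```perl
use strict;
use warnings;

# Hypothetical sample data; @a and $aref follow the names used above.
my @a    = (10, 20, 30, 40);
my $aref = \@a;                  # Make Rule 1: a reference to @a

# Each pair does the same thing, once by name and once via the reference.
my @by_name = reverse @a;
my @by_ref  = reverse @{$aref};

my $fourth_by_name = $a[3];
my $fourth_by_ref  = ${$aref}[3];

# Assigning through the reference changes @a itself, because $aref
# refers to that same array and not to a copy of it.
${$aref}[3] = 17;

print "reversed: @by_ref\n";
print "fourth element is now $a[3]\n";
```

Because the assignment goes through the reference, the last line reports
the new value even though it reads the element by the array's own name.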

Using a hash reference is I<exactly> the same:

    %h                  %{$href}                A hash
    keys %h             keys %{$href}           Get the keys from the hash
    $h{'red'}           ${$href}{'red'}         An element of the hash
    $h{'red'} = 17      ${$href}{'red'} = 17    Assigning an element

Whatever you want to do with a reference, B<Use Rule 1> tells you how
to do it. You just write the Perl code that you would have written
for doing the same thing to a regular array or hash, and then replace
the array or hash name with C<{$reference}>. "How do I loop over an
array when all I have is a reference?" Well, to loop over an array, you
would write

    for my $element (@array) {
        ...
    }

so replace the array name, C<@array>, with the reference:

    for my $element (@{$aref}) {
        ...
    }

"How do I print out the contents of a hash when all I have is a
reference?" First write the code for printing out a hash:

    for my $key (keys %hash) {
        print "$key => $hash{$key}\n";
    }

And then replace the hash name with the reference:

    for my $key (keys %{$href}) {
        print "$key => ${$href}{$key}\n";
    }

=head3 B<Use Rule 2>

B<Use Rule 1> is all you really need, because it tells you how to do
absolutely everything you ever need to do with references. But the
most common thing to do with an array or a hash is to extract a single
element, and the B<Use Rule 1> notation is cumbersome. So there is an
abbreviation.

C<${$aref}[3]> is too hard to read, so you can write C<< $aref->[3] >>
instead.

C<${$href}{red}> is too hard to read, so you can write
C<< $href->{red} >> instead.

If C<$aref> holds a reference to an array, then C<< $aref->[3] >> is
the fourth element of the array. Don't confuse this with C<$aref[3]>,
which is the fourth element of a totally different array, one
deceptively named C<@aref>.
C<$aref> and C<@aref> are unrelated the
same way that C<$item> and C<@item> are.

Similarly, C<< $href->{'red'} >> is part of the hash referred to by
the scalar variable C<$href>, perhaps even one with no name.
C<$href{'red'}> is part of the deceptively named C<%href> hash. It's
easy to leave out the C<< -> >> by mistake, and if you do, you'll get
bizarre results when your program gets array and hash elements out of
totally unexpected hashes and arrays that weren't the ones you wanted
to use.

=head2 An Example

Let's see a quick example of how all this is useful.

First, remember that C<[1, 2, 3]> makes an anonymous array containing
C<(1, 2, 3)>, and gives you a reference to that array.

Now think about

    @a = ( [1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]
         );

@a is an array with three elements, and each one is a reference to
another array.

C<$a[1]> is one of these references. It refers to an array, the array
containing C<(4, 5, 6)>, and because it is a reference to an array,
B<Use Rule 2> says that we can write C<< $a[1]->[2] >> to get the
third element from that array. C<< $a[1]->[2] >> is the 6.
Similarly, C<< $a[0]->[1] >> is the 2. What we have here is like a
two-dimensional array; you can write C<< $a[ROW]->[COLUMN] >> to get
or set the element in any row and any column of the array.

The notation still looks a little cumbersome, so there's one more
abbreviation:

=head2 Arrow Rule

In between two B<subscripts>, the arrow is optional.

Instead of C<< $a[1]->[2] >>, we can write C<$a[1][2]>; it means the
same thing. Instead of C<< $a[0]->[1] = 23 >>, we can write
C<$a[0][1] = 23>; it means the same thing.

Now it really looks like two-dimensional arrays!

You can see why the arrows are important. Without them, we would have
had to write C<${$a[1]}[2]> instead of C<$a[1][2]>.
For
three-dimensional arrays, they let us write C<$x[2][3][5]> instead of
the unreadable C<${${$x[2]}[3]}[5]>.

=head1 Solution

Here's the answer to the problem I posed earlier, of reformatting a
file of city and country names.

     1   my %table;

     2   while (<>) {
     3     chomp;
     4     my ($city, $country) = split /, /;
     5     $table{$country} = [] unless exists $table{$country};
     6     push @{$table{$country}}, $city;
     7   }

     8   foreach my $country (sort keys %table) {
     9     print "$country: ";
    10     my @cities = @{$table{$country}};
    11     print join ', ', sort @cities;
    12     print ".\n";
    13   }

The program has two pieces: Lines 2-7 read the input and build a data
structure, and lines 8-13 analyze the data and print out the report.
We're going to have a hash, C<%table>, whose keys are country names,
and whose values are references to arrays of city names. The data
structure will look like this:

           %table
        +-------+---+
        |       |   |   +-----------+--------+
        |Germany| *---->| Frankfurt | Berlin |
        |       |   |   +-----------+--------+
        +-------+---+
        |       |   |   +----------+
        |Finland| *---->| Helsinki |
        |       |   |   +----------+
        +-------+---+
        |       |   |   +---------+------------+----------+
        |  USA  | *---->| Chicago | Washington | New York |
        |       |   |   +---------+------------+----------+
        +-------+---+

We'll look at output first. Supposing we already have this structure,
how do we print it out?

     8   foreach my $country (sort keys %table) {
     9     print "$country: ";
    10     my @cities = @{$table{$country}};
    11     print join ', ', sort @cities;
    12     print ".\n";
    13   }

C<%table> is an ordinary hash, and we get a list of keys from it, sort
the keys, and loop over the keys as usual. The only use of references
is in line 10.
C<$table{$country}> looks up the key C<$country> in the hash
and gets the value, which is a reference to an array of cities in that
country. B<Use Rule 1> says that we can recover the array by saying
C<@{$table{$country}}>. Line 10 is just like

    @cities = @array;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>. The C<@> tells Perl to get the entire array.
Having gotten the list of cities, we sort it, join it, and print it
out as usual.

Lines 2-7 are responsible for building the structure in the first
place. Here they are again:

     2   while (<>) {
     3     chomp;
     4     my ($city, $country) = split /, /;
     5     $table{$country} = [] unless exists $table{$country};
     6     push @{$table{$country}}, $city;
     7   }

Lines 2-4 acquire a city and country name. Line 5 looks to see if the
country is already present as a key in the hash. If it's not, the
program uses the C<[]> notation (B<Make Rule 2>) to manufacture a new,
empty anonymous array of cities, and installs a reference to it into
the hash under the appropriate key.

Line 6 installs the city name into the appropriate array.
C<$table{$country}> now holds a reference to the array of cities seen
in that country so far. Line 6 is exactly like

    push @array, $city;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>. The C<push> adds a city name to the end of the
referred-to array.

There's one fine point I skipped. Line 5 is unnecessary, and we can
get rid of it.

     2   while (<>) {
     3     chomp;
     4     my ($city, $country) = split /, /;
     5     ####  $table{$country} = [] unless exists $table{$country};
     6     push @{$table{$country}}, $city;
     7   }

If there's already an entry in C<%table> for the current C<$country>,
then nothing is different.
Line 6 will locate the value in
C<$table{$country}>, which is a reference to an array, and push
C<$city> into the array. But what does it do when
C<$country> holds a key, say C<Greece>, that is not yet in C<%table>?

This is Perl, so it does the exact right thing. It sees that you want
to push C<Athens> onto an array that doesn't exist, so it helpfully
makes a new, empty, anonymous array for you, installs it into
C<%table>, and then pushes C<Athens> onto it. This is called
`autovivification'--bringing things to life automatically. Perl saw
that the key wasn't in the hash, so it created a new hash entry
automatically. Perl saw that you wanted to use the hash value as an
array, so it created a new empty array and installed a reference to it
in the hash automatically. And as usual, Perl made the array one
element longer to hold the new city name.

=head1 The Rest

I promised to give you 90% of the benefit with 10% of the details, and
that means I left out 90% of the details. Now that you have an
overview of the important parts, it should be easier to read the
L<perlref> manual page, which discusses 100% of the details.

Some of the highlights of L<perlref>:

=over 4

=item *

You can make references to anything, including scalars, functions, and
other references.

=item *

In B<Use Rule 1>, you can omit the curly brackets whenever the thing
inside them is an atomic scalar variable like C<$aref>. For example,
C<@$aref> is the same as C<@{$aref}>, and C<$$aref[1]> is the same as
C<${$aref}[1]>. If you're just starting out, you may want to adopt
the habit of always including the curly brackets.

=item *

This doesn't copy the underlying array:

    $aref2 = $aref1;

You get two references to the same array.
If you modify
C<< $aref1->[23] >> and then look at
C<< $aref2->[23] >> you'll see the change.

To copy the array, use

    $aref2 = [@{$aref1}];

This uses C<[...]> notation to create a new anonymous array, and
C<$aref2> is assigned a reference to the new array. The new array is
initialized with the contents of the array referred to by C<$aref1>.

Similarly, to copy an anonymous hash, you can use

    $href2 = {%{$href1}};

=item *

To see if a variable contains a reference, use the C<ref> function. It
returns true if its argument is a reference. Actually it's a little
better than that: It returns C<HASH> for hash references and C<ARRAY>
for array references.

=item *

If you try to use a reference like a string, you get strings like

    ARRAY(0x80f5dec)   or    HASH(0x826afc0)

If you ever see a string that looks like this, you'll know you
printed out a reference by mistake.

A side effect of this representation is that you can use C<eq> to see
if two references refer to the same thing. (But you should usually use
C<==> instead because it's much faster.)

=item *

You can use a string as if it were a reference. If you use the string
C<"foo"> as an array reference, it's taken to be a reference to the
array C<@foo>. This is called a I<soft reference> or I<symbolic
reference>. The declaration C<use strict 'refs'> disables this
feature, which can cause all sorts of trouble if you use it by accident.

=back

You might prefer to go on to L<perllol> instead of L<perlref>; it
discusses lists of lists and multidimensional arrays in detail. After
that, you should move on to L<perldsc>; it's a Data Structure Cookbook
that shows recipes for using and printing out arrays of hashes, hashes
of arrays, and other kinds of data.
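
Three of the highlights listed above (copying a reference versus copying
the array, the C<ref> function, and comparing references with C<==>) can
be seen together in a short script. This is only a sketch: the sample
data and the names C<$alias> and C<$copy> are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my $aref1 = [ 1, 2, 3 ];

my $alias = $aref1;              # a second reference to the same array
my $copy  = [ @{$aref1} ];       # a reference to a new array, same contents

$aref1->[0] = 99;                # modify through the original reference

print "alias sees $alias->[0], copy still has $copy->[0]\n";

# ref reports what kind of thing a reference refers to.
my $href = { APR => 4, AUG => 8 };
print ref($aref1), " ", ref($href), "\n";

# == on references is true only when both refer to the same thing.
print "alias refers to the same array\n"  if $alias == $aref1;
print "copy refers to a different array\n" if $copy != $aref1;
```

Note that C<[ @{$aref1} ]> copies only the top level. If the array itself
held references, the copy would share the things they refer to.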

=head1 Summary

Everyone needs compound data structures, and in Perl the way you get
them is with references. There are four important rules for managing
references: Two for making references and two for using them. Once
you know these rules you can do most of the important things you need
to do with references.

=head1 Credits

Author: Mark Jason Dominus, Plover Systems (C<mjd-perl-ref+@plover.com>)

This article originally appeared in I<The Perl Journal>
( http://www.tpj.com/ ) volume 3, #2. Reprinted with permission.

The original title was I<Understand References Today>.

=head2 Distribution Conditions

Copyright 1998 The Perl Journal.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain. You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit. A simple comment in the code giving credit would be
courteous but is not required.

=cut