Parrot-Virtual-Machine.pdf

Date post: 08-Aug-2018
Upload: praveen-munagapati
  • 8/22/2019 Parrot-Virtual-Machine.pdf

    PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.

    PDF generated at: Thu, 26 Sep 2013 13:32:25 UTC

    Parrot Virtual Machine

    A Book from English Wikibooks

    Contents

    Articles

    Wikibooks:Collections Preface
    Introduction

    Introduction To Parrot
        Introduction
        Building Parrot
        Running Parrot

    Programming For Parrot
        Parrot Programming
        Parrot Assembly Language
        Parrot Intermediate Representation
        Parrot Magic Cookies
        Multithreading and Concurrency
        Exception Handling
        Classes and Objects
        The Parrot Debugger

    Parrot Compiler Tools
        Parrot Compiler Tools
        Parrot Grammar Engine
        Not Quite Perl
        Optables and Expressions
        Advanced PGE
        Building A Compiler
        HLL Interoperation

    Parrot Hacking
        Parrot Internals
        IMCC and PIRC
        Run Core
        Memory and Garbage Collection
        PMC System
        String System
        Exception Subsystem
        IO Subsystem
        JIT and NCI
        Parrot Embedding
        Extensions
        Packfiles

    Appendices
        PIR Reference
        PASM Reference
        PAST Node Reference
        Languages on Parrot
        HLLCompiler Class
        Command Line Options
        Built-In PMCs
        Bytecode File Format
        VTABLE List

    "Squaak" Language Tutorial
        Squaak Tutorial
        Introduction
        Poking in Compiler Guts
        Squaak Details and First Steps
        PAST Nodes and More Statements
        Variable Declaration and Scope
        Scope and Subroutines
        Operators and Precedence
        Hash Tables and Arrays
        Wrap-Up and Conclusion

    Resources and Licensing
        Resources
        Licensing

    References
        Article Sources and Contributors
        Image Sources, Licenses and Contributors

    Article Licenses
        License

    Wikibooks:Collections Preface

    This book was created by volunteers at Wikibooks (http://en.wikibooks.org).

    What is Wikibooks?

    Started in 2003 as an offshoot of the popular Wikipedia project, Wikibooks is

    a free, collaborative wiki website dedicated to creating high-quality textbooks

    and other educational books for students around the world. In addition to

    English, Wikibooks is available in over 130 languages, a complete listing of

    which can be found at http://www.wikibooks.org. Wikibooks is a "wiki",

    which means anybody can edit the content there at any time. If you find an

    error or omission in this book, you can log on to Wikibooks to make

    corrections and additions as necessary. All of your changes go live on the

    website immediately, so your effort can be enjoyed and utilized by other

    readers and editors without delay.

    Books at Wikibooks are written by volunteers, and can be accessed and printed for free from the website. Wikibooks

    is operated entirely by donations, and a certain portion of proceeds from sales is returned to the Wikimedia

    Foundation to help keep Wikibooks running smoothly. Because of the low overhead, we are able to produce and sell

    books for much cheaper than proprietary textbook publishers can. This book can be edited by anybody at any

    time, including you. We don't make you wait two years to get a new edition, and we don't stop selling old versions

    when a new one comes out.

    Note that Wikibooks is not a publisher of books, and is not responsible for the contributions of its volunteer editors.

    PediaPress.com is a print-on-demand publisher that is also not responsible for the content that it prints. Please see

    our disclaimer for more information: http://en.wikibooks.org/wiki/Wikibooks:General_disclaimer.

    What is this book?

    This book was generated by the volunteers at Wikibooks, a team of people from around the world with varying

    backgrounds. The people who wrote this book may not be experts in the field. Some may not even have a passing

    familiarity with it. The result of this is that some information in this book may be incorrect, out of place, or

    misleading. For this reason, you should never rely on a community-edited Wikibook when dealing in matters of

    medical, legal, financial, or other importance. Please see our disclaimer for more details on this.

    Despite the warning of the last paragraph, however, books at Wikibooks are continuously edited and improved. If

    errors are found they can be corrected immediately. If you find a problem in one of our books, we ask that you be

    bold in fixing it. You don't need anybody's permission to help or to make our books better.

    Wikibooks runs off the assumption that many eyes can find many errors, and many able hands can fix them. Over

    time, with enough community involvement, the books at Wikibooks will become very high-quality indeed. You are

    invited to participate at Wikibooks to help make our books better. As you find problems in your book, don't just

    complain about them: log on and fix them! This is a kind of proactive and interactive reading experience that you

    probably aren't familiar with yet, so log on to http://en.wikibooks.org and take a look around at all the

    possibilities. We promise that we won't bite!

    Who are the authors?

    The volunteers at Wikibooks come from around the world and have a wide range of educational and professional

    backgrounds. They come to Wikibooks for different reasons, and perform different tasks. Some Wikibookians are

    prolific authors, some are perceptive editors, some fancy illustrators, others diligent organizers. Some Wikibookians

    find and remove spam, vandalism, and other nonsense as it appears. Most Wikibookians perform a combination of

    these jobs.

    It's difficult to say who the authors are for any particular book, because so many hands have touched it and so many

    changes have been made over time. It's not unheard of for a book to have been edited thousands of times by

    hundreds of authors and editors. You could be one of them too, if you're interested in helping out.

    Wikibooks in Class

    Books at Wikibooks are free, and with the proper editing and preparation they can be used as cost-effective

    textbooks in the classroom or for independent learners. In addition to using a Wikibook as a traditional read-only

    learning aid, it can also become an interactive class project. Several classes have come to Wikibooks to write new

    books and improve old books as part of their normal course work. In some cases, the books written by students one

    year are used to teach students in the same class next year. Books written can also be used in classes around the

    world by students who might not be able to afford traditional textbooks.

    Happy Reading!

    We at Wikibooks have put a lot of effort into these books, and we hope that you enjoy reading and learning from

    them. We want you to keep in mind that what you are holding is not a finished product but instead a work in

    progress. These books are never "finished" in the traditional sense, but they are ever-changing and evolving to meet

    the needs of readers and learners everywhere. Despite this constant change, we feel our books can be reliable and

    high-quality learning tools at a great price, and we hope you agree. Never hesitate to stop in at Wikibooks and make

    some edits of your own. We hope to see you there one day. Happy reading!

    Introduction

    What Is Parrot?

    Parrot is a virtual machine (VM), similar to the Java VM and the .NET VM. However, unlike these two, which are

    designed for statically typed languages like Java or C#, Parrot is designed for use with dynamically typed languages

    such as Perl, Python, Ruby, or PHP.

    The Parrot VM itself is written in the C programming language, which means that, in theory, it will be portable to

    a large number of different computer architectures and operating systems. It is designed to be modular and

    extensible.

    Programmers can write in any of the languages for which a Parrot-capable compiler exists. Modules written in one

    language, such as Perl, can transparently interoperate with modules which have originally been written in any of the

    other languages supported by Parrot. This easy interoperability and native support for cutting-edge dynamic

    programming features make Parrot an important tool for next-generation language designers and implementers.

    It is precisely because Parrot is intended to support so many diverse high-level languages that Parrot has developed a

    very general and feature-rich architecture. Much of the Parrot architecture is still under active development, so those

    parts cannot yet be discussed properly in this book. Once Parrot reaches a stable release, and

    more details are set in stone, this book will be able to provide more comprehensive coverage.

    History of Parrot

    The Parrot project was born from the Perl 6 development project. As such, the history of Parrot, at least its early

    history, is closely tied to the history of Perl 6. In fact, once you understand just how large and ambitious Perl 6 is,

    you'll start to understand why Parrot must have all the features it has.

    It was famously said of version 5 of the Perl programming language that "nothing can parse Perl but perl". The

    implication was that the perl executable was the only program that could reliably parse the Perl programming

    language. There were two reasons for this. First, the Perl language didn't follow any formal specification; the

    behavior of the perl interpreter was the definitive documentation for the actions of Perl. Second, the Perl

    programming language allowed the use of source filters, which let a program modify its own source code prior

    to execution. This means that to reliably parse and understand a Perl program, you needed to be able to execute the

    source filters reliably. The only program that could do both was perl.
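
    The source-filter problem can be illustrated with a small sketch. It is written in Python rather than Perl, and every name in it (shout_filter, run_filtered, the made-up SAY keyword) is hypothetical. It only mimics the idea of code that rewrites a program's text before the interpreter sees it; it does not show how perl actually implements source filters.

```python
# Illustrative sketch only: mimics the idea of a source filter, in which
# program text is rewritten by code before it is parsed and executed.
# All names here are hypothetical and unrelated to perl's real mechanism.


def shout_filter(source: str) -> str:
    """A toy filter: rewrite the made-up keyword 'SAY' into a print call.

    Only handles top-level statements, since indentation is discarded.
    """
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("SAY "):
            # 'SAY x' is not valid Python until the filter has run.
            out.append(f"print({stripped[4:]})")
        else:
            out.append(line)
    return "\n".join(out)


def run_filtered(source: str, source_filter) -> dict:
    """Apply the filter, then execute the rewritten text."""
    namespace: dict = {}
    exec(source_filter(source), namespace)
    return namespace


program = "x = 6 * 7\nSAY x"

# A parser that never runs the filter rejects the raw text ...
try:
    compile(program, "<raw>", "exec")
    raw_parses = True
except SyntaxError:
    raw_parses = False

# ... while the filtered text is ordinary code that can be executed.
result = run_filtered(program, shout_filter)
```

    In this toy setting, a tool that wants to understand the program has to run the filter first, which is the same bind the paragraph above describes for Perl 5.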

    The next planned version of Perl, Perl 6, was supposed to be a major rewrite of the language. In addition to

    standardizing and bringing sanity to all the features which had slowly entered the language grammar, it was decided

    that Perl 6 would be a formal specification first, and implementations of that specification later.

    The name "Parrot" was first used as an April Fool's joke. The story claimed that the Perl and Python languages

    (which are competitors, and which were both undergoing major redesigns) were going to merge together into a

    single language named Parrot. This was, of course, a hoax, but the idea was a powerful one. When the project was

    started to create a virtual machine that would be capable of running not only Perl 6, but also Python and other

    dynamic languages, the name Parrot was a perfect fit.

    The first version of Parrot, 0.0.1, was released in September 2001. The development team prepares a stable point

    release on the third Tuesday of every month.

    The Parrot Foundation

    The Parrot Foundation was established in mid-2008 to serve as an advocate for Parrot. The Parrot Foundation is a

    non-profit charitable organization in the United States, and donations to the foundation are tax-deductible.

    Prior to the creation of the Parrot Foundation, Parrot was managed and overseen by the Perl Foundation. This

    relationship was historical in nature, because Parrot was originally intended just to be the backend for the

    Perl 6 programming language. Since Parrot has grown beyond that, and is attempting to deal equally with all

    high-level dynamic programming languages, it was decided that Parrot should separate from the Perl Foundation.

    Parrot's website is http://www.parrot.org.

    Who Is This Book For?

    This book is for readers at the intermediate to advanced level with a solid background in computer programming.

    Perl Programming would be a good start, although a background in any dynamic language would be helpful. Having

    a background in Compiler Construction, Regular Expressions, or the compiler-building tools Lex and Yacc would

    also be a benefit.

    For the sections about Parrot hacking, a background knowledge of C Programming is required.

    What Will We Cover?

    This book is going to serve as, at least, a basic introduction to the Parrot Virtual Machine. We will cover basic

    programming for Parrot in the lowest-level languages that it supports: PIR and PASM. We will also discuss one of

    the greatest strengths of the Parrot platform, the Parrot Compiler Tools (PCT), which enable compilers to be written

    easily for higher-level languages like Perl and Python.

    Later sections will actually delve into the Parrot internals, and discuss how Parrot works and how to contribute code

    for the Parrot development project. Extensive reference materials at the end of the book will try to keep track of the

    information that is most necessary for developers.

    Where To Get More Information

    The definitive source for Parrot information and documentation is the Parrot project website, http://www.parrot.org.

    Parrot programmers, hackers, and enthusiasts also chat in the Parrot IRC chatroom [1].

    How To Get Involved In Parrot Development

    The Parrot development process is large and varied. Depending on skill level, there are many opportunities for a

    person to get involved in Parrot development. Here are some examples:

    If you are good at C programming

    If you know C programming, help is always needed to work on Parrot. In addition to normal development

    tasks, there are bug reports to resolve, compile errors to fix, new platforms to port to, and optimizations to

    perform. Parrot needs to be ported to many different systems, and it needs to be properly tested on all of them.

    If you are good with Perl programming

    Many of the Parrot build tools are written in Perl 5. However, there is also a massive development effort to

    support the Perl 6 project. An intermediate language called Not Quite Perl (NQP), which is similar to Perl 6 but

    omits many of its features, is used to implement compilers for higher-level languages. If you are good with

    Perl and are willing to learn Perl 6 and NQP, there is a lot of compiler-implementation work that needs to be

    done.

    If you are good with system administration

    Parrot needs to be built and tested regularly. People are always needed who are willing to perform regular

    builds and tests of Parrot. If you are willing to set up an automated build bot to perform regular builds and tests,

    that's even better.

    If you can write

    This book needs your help, and anybody can edit it. Also, there are a number of other book-writing projects

    concerning Parrot that are looking for active authors and editors. The more that is written about Parrot, the more

    new users will be able to learn about it.

    If you don't fall cleanly into any of these categories, there are other opportunities to help as well. This might be a

    good opportunity for you to learn a new skill, like programming Perl 6, PIR, or NQP. If you are interested in writing

    or editing, you can help with this wikibook too!

    Parrot Developers

    There are several different roles that people have taken up in Parrot development, even though there is no centralized

    management hierarchy. Volunteers tend to fulfill certain roles that they enjoy and that they have skill at.

    Architect

    The Parrot Architect, currently Allison Randal, is in charge of laying out the overall design specifications

    for Parrot. The architect has the final say in important decisions and is responsible for ensuring that design

    documents are up to date. Because the architect lays out the overall requirements of the system, other volunteers

    are able to contribute to the areas that interest them most.

    Pumpking

    The Pumpking has oversight of the Parrot source repository, and is also the lead developer. The Pumpking

    defines coding standards that all contributors must follow, and helps to coordinate other contributors.

    Release Managers

    Parrot has a schedule of making releases approximately once a month. The release manager oversees this

    process, and ensures that releases are high quality. Release managers will control when new features can be

    added, and when code should be frozen for debugging. Pre-release debugging sessions are very productive and

    important periods for Parrot development, and ensure that many bugs get fixed between each release.

    Committer

    A committer is a person with write access to the Parrot SVN repository. Committers typically have submitted

    several patches and participated in Parrot-related discussions.

    Metacommitter

    A metacommitter is a person who has write access to the Parrot SVN repository and is also capable of

    promoting new committers. The architect and the Pumpking are automatically metacommitters, but there are

    several others too.

    Among the above groups, there are other designations as well. This is because many committers tend to focus their

    efforts on a relatively small portion of the Parrot development effort.

    Core Developer

    A person who works on Parrot internals, typically one or two subsystems. Core developers need to be skilled

    in C programming, and also need to work with many development utilities written in Perl.

    Compiler Developer

    Like core developers, compiler developers work on the internals of Parrot, typically by writing lots of

    C code. Unlike core developers, they focus their effort on the various compiler front-ends such as IMCC,

    PIRC, PGE, or TGE.

    High-Level Language Developer

    A high-level language developer is a person who is working to implement a high-level language on Parrot.

    Even though they have commit access to the whole repository, many high-level language developers will

    focus only on a single language implementation. High-level language developers need to be skilled in PCT and

    many of the Perl 6-based development tools for HLLs.

    Build Managers

    Build managers help to create and maintain tools that other developers rely on.

    Testers

    Testers create and maintain a suite of hundreds and thousands of tests to verify the operations of Parrot, its

    subsystems, its compilers and the high-level languages that run on it.

    Platform Porters

    A platform porter ensures that Parrot can be built on multiple platforms. Porters must build and test Parrot on

    different platforms, and also create and distribute pre-compiled installation packages for different platforms.

    This certainly isn't an exhaustive list of possible roles. If you have programming skills but don't know whether you

    fit into any of the designations above, your help is still needed.

    Resources

    http://www.parrotcode.org/docs/intro.html

    http://www.parrotcode.org/docs/roadmap.html

    http://www.parrotcode.org/docs/parrothist.html

    References

    [1] irc://irc.perl.org/Parrot

    Introduction To Parrot

    IntroductionWhat Is Parrot?

    Parrot is a virtual machine (VM), similar to the Java VM and the .NET VM. However, unlike these two which are

    designed for statically-typed languages like Java or C#, Parrot is designed for use with dynamically typed languages

    such as Perl, Python, Ruby, or PHP.

    The Parrot VM itself is written in the C programming language, which means thatin theoryit will be portable to

    a large number of different computer architectures and operating systems. It is written to be easily modular and

    extensible.

    Programmers can write in any of the languages for which a Parrot-capable compiler exists. Modules written in one

    language, such as Perl, can transparently interoperate with modules which have originally been written in any of the

    other languages supported by Parrot. This easy interoperability and native support for cutting-edge dynamic

    programming features makes Parrot an important tool for next-generation language designers and implementers.

    It is precisely because Parrot is intended to support so many diverse high level languages that Parrot has developed a

    very general and feature-rich architecture. Much of the Parrot architecture is still under active development, so those

    parts will not be able to be properly discussed here in this book quite yet. Once Parrot reaches a stable release, and

    more details are set in stone, this book will be able to provide a more comprehensive coverage.

    History of Parrot

    The Parrot project was born from the Perl 6 development project. As such, the history of Parrot, at least the early

    history of it, is closely tied to the history of Perl 6. In fact, understanding just how large and ambitious Perl 6 is,

    you'll start to understand why Parrot must have all the features it has.

    It was famously quoted about version 5 of the Perl programming language that "nothing can parse Perl but perl". The

    implication was that the perl executable was the only program that could reliably parse the Perl programming

    language. There were two reasons for this. First, the Perl language didn't follow any formal specification; The

    behavior of the perl interpreter was the definitive documentation for the actions of Perl. Second, the Perl

    programming language allowed the use ofsource filters, programs which could modify their own source code prior

    to execution. This means that to reliably parse and understand a Perl program, you needed to be able to execute the

    source filters reliably. The only program that could do both was perl.

    The next planned version of Perl, Perl 6, was supposed to be a major rewrite of the language. In addition to

    standardizing and bringing sanity to all the features which had slowly entered the language grammar, it was decided

    that Perl 6 would be a formal specification first, and implementations of that specification later.

    The name "Parrot" was first used as an April Fool's joke. The story claimed that the Perl and Python languages

    (which are competitors, and which were both undergoing major redesigns) were going to merge together into a

    single language named Parrot. This was, of course, a hoax, but the idea was a powerful one. When the project was

    started to create a virtual machine that would be capable of running not only Perl 6, but also Python and other

    dynamic languages, the name Parrot was a perfect fit.

    The first release of Parrot, 0.0.1, was released in September 2001. The development team has prepared a stable pointrelease on the third Tuesday of every month.

    http://en.wikibooks.org/w/index.php?title=Perl_6_Programming
  • 8/22/2019 Parrot-Virtual-Machine.pdf

    11/170

    Introduction 8

    The Parrot Foundation

    The Parrot Foundation was established in mid 2008 to serve as an advocate for Parrot. The Parrot Foundation is a

    non-profit charity organization in the United States, and donations to the foundation are tax-deductable.

    Prior to the creation of the Parrot Foundation, Parrot was managed and overseen by the Perl Foundation. This

    relationship was historical in nature, due to the fact that Parrot was originally intended just to be the backend for the

    Perl 6 programming language. Since Parrot has grown beyond that, and is attempting to deal equally with all

    high-level dynamic programming languages, it was decided to become separate from the Perl Foundation.

    Parrot's website is http://www.parrot. org

    Who Is This Book For?

    This book is for readers at the intermediate to advanced level with a solid background in computer programming.

    Perl Programming would be a good start, although a background in any dynamic language would be helpful. Having

    a background in Compiler Construction, Regular Expressions, or the compiler-building tools Lex and Yacc would

    also be a benefit.

    For the sections about Parrot hacking, a background knowledge of C Programming is required.

    What Will We Cover?

    This book is going to serve as, at least, a basic introduction to the Parrot Virtual Machine. We will cover basic

    programming for Parrot in the lowest-level languages that it supports: PIR and PASM. We will also discuss one of

    the greatest strengths of the Parrot platform, the Parrot Compiler Tools (PCT), which enables compilers to be written

    easily for higher-level languages like Perl and Python.

    Later sections will actually delve into the Parrot internals, and discuss how Parrot works and how to contribute code

    for the Parrot development project. Extensive reference materials at the end of the book will try to keep track of the

    information that is most necessary for developers.

    Where To Get More Information

    The definative source for Parrot information and documentation is the Parrot project website, http:// www. parrot.

    org.Parrot programmers, hackers, and enthusiasts also chat in the Parrot IRC chatroom[1]

    .

    How To Get Involved In Parrot Development

    The Parrot development process is large and varied. Depending on skill level, there are many opportunities for a

    person to get involved in Parrot development. Here are some examples:

    If you are good at C programming

    If you know C programming, help is always needed to work on Parrot. In addition to normal development

    tasks, there are bug reports to resolve, compile errors to fix, new platforms to port to, and optimizations to

    perform. Parrot needs to be ported to many different systems, and it needs to be properly tested on all of them.

    If you are good with Perl programming

    Much of the Parrot build tools are written in Perl 5. However, there is also a massive development effort to

    support the Perl 6 project. An intermediate language which is similar to Perl 6 but with many missing features

    called Not Quite Perl (NQP) is used to implement compilers for higher-level languages. If you are good with

    Perl and are willing to learn Perl 6 and NQP, there is a lot of compiler-implementation work that needs to be

    done.

    If you are good with system administration

    http://irc//irc.perl.org/Parrothttp://www.parrot.org./http://www.parrot.org./http://en.wikibooks.org/w/index.php?title=C_Programminghttp://en.wikibooks.org/w/index.php?title=Regular_Expressionshttp://en.wikibooks.org/w/index.php?title=Compiler_Constructionhttp://en.wikibooks.org/w/index.php?title=Perl_Programminghttp://www.parrot.org/
  • 8/22/2019 Parrot-Virtual-Machine.pdf

    12/170

    Introduction 9

    Parrot needs to be built and tested regularly. People are always needed who are willing to perform regular

    builds and tests of Parrot. If you are willing to set up automated build bot to perform regular builds and tests,

    that's even better.

    If you can write

    This book needs your help, and anybody can edit it. Also, there are a number of other book-writing projects

    concerning Parrot that are looking for active authors and editors. The more is written about Parrot, the morenew users will be able to learn about it.

    If you don't fall cleanly into any of these categories, there are other opportunities to help as well. This might be a

    good opportunity for you to learn a new skill, like programming Perl 6, PIR, or NQP. If you are interested in writing

    or editing, you can help with this wikibook too!

    Parrot Developers

    There are several different roles that people have taken up in Parrot development, even though there is no centralized

    management hierarchy. Volunteers tend to fulfill certain roles that they enjoy and that they have skill at.

    ArchitectThe Parrot Architect, currently w:Allison Randal, is in charge with laying out the overall design specifications

    for Parrot. The architect has the final say in important decisions and is responsible to ensure that design

    documents are up to date. By laying out the overall requirements of the system, other volunteers are able to

    contribute to areas where they are most interested.

    Pumpking

    The Pumpking has oversight of the Parrot source repository, and is also the lead developer. The Pumpking

    defines coding standards that all contributors must follow, and helps to coordinate other contributors.

    Release Managers

    Parrot has a schedule of making releases approximately once a month. The release manager oversees thisprocess, and ensures that releases are high quality. Release managers will control when new features can be

    added, and when code should be frozen for debugging. Pre-release debugging sessions are very productive and

    important periods for Parrot development, and ensure that many bugs get fixed between each release.

    Committer

    A committer is a person with write access to the Parrot SVN repository. Committers typically have submitted

    several patches and participated in Parrot-related discussions.

    Metacommitter

    A metacommitter is a person who has write access to the Parrot SVN repository and is also capable of

    promoting new committers. The architect and the Pumpking are automatically metacommitters, but there areseveral others too.

    Among the above groups, there are other designations as well. This is because many committers tend to focus their

    efforts on a relatively small portion of the Parrot development effort.

    Core Developer

    A person who works on Parrot internals, typically one or two subsystems. Core developers need to be skilled

    in C programming, and also need to work with many development utilities written in Perl.

    Compiler Developer

    Like the Core Developers, these developers work on the internals of Parrot, typically by writing lots of

    C code. In contrast, Compiler Developers focus their effort on the various compiler front-ends such as IMCC, PIRC, PGE, or TGE.

    High-Level Language Developer

    A high-level language developer is a person who is working to implement a high-level language on Parrot.

    Even though they have commit access to the whole repository, many high-level language developers will

    focus only on a single language implementation. High-level language developers need to be skilled in PCT and

    many of the Perl 6-based development tools for HLLs.

    Build Managers

    Build managers help to create and maintain tools that other developers rely on.

    Testers

    Testers create and maintain a suite of hundreds and thousands of tests to verify the operations of Parrot, its

    subsystems, its compilers and the high-level languages that run on it.

    Platform Porters

    A platform porter ensures that Parrot can be built on multiple platforms. Porters must build and test Parrot on

    different platforms, and also create and distribute pre-compiled installation packages for different platforms.

    This is certainly not an exhaustive list of possible roles. If you have programming skills but don't know whether you

    fit any of the designations above, your help is still needed.

    Resources

    http://www.parrotcode.org/docs/intro.html

    http://www.parrotcode.org/docs/roadmap.html

    http://www.parrotcode.org/docs/parrothist.html

    Building Parrot

    Obtaining Parrot

    The most recent development release of Parrot can be downloaded from CPAN[1].

    Development of Parrot is controlled through the SVN repository at http://svn.parrot.org/parrot/. The most

    up-to-date version of Parrot can be obtained from https://svn.parrot.org/parrot/trunk/ via svn checkout.

    Building Parrot From Source

    Parrot is currently available as a source code download, although some people are trying to maintain precompiled

    versions for download as well. These versions are typically available for Windows, Cygwin, Debian, and Red Hat.

    Other binary distributions may be added in the future. Instructions for installing a precompiled binary distribution of

    Parrot for your system vary depending on the particular platform and the method in which it was bundled. Consult

    the accompanying documentation for any distribution you download for more details. This page will not discuss

    these particular distributions, only the method of building Parrot from the original source code.

    On a Windows platform, substitute the freely available nmake for make.

    Currently the Parrot build process requires the use of make, a working C compiler, and a working Perl 5 installation.

    Perl should be version 5.8 or higher. Without these things, it will not be possible for you to build Parrot. Automated

    testing is performed on a variety of systems with various combinations of these tools, and any particular revision

    should be able to be compiled properly. If you have problems compiling Parrot on your system, send an email to the

    Parrot Porters mailing list with details of the problems, and one of the Parrot developers will try to help fix it.
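
    Before running the build, it can help to confirm that the required tools are on the search path. The following is a minimal sketch in Python, not part of the Parrot build system; the function name missing_tools and the candidate program names (including gmake, gcc, clang, and cl) are assumptions chosen for illustration, and the sketch does not check the Perl version.

```python
import shutil


def missing_tools(candidates, which=shutil.which):
    """Return the requirement names for which no candidate program is found.

    candidates maps a requirement name to the program names that could
    satisfy it. The lookup function is injectable so the check can be
    exercised without touching the real search path.
    """
    return [
        requirement
        for requirement, programs in candidates.items()
        if not any(which(program) for program in programs)
    ]


# Hypothetical candidate names; adjust them for the local system.
REQUIRED = {
    "make": ["make", "nmake", "gmake"],
    "C compiler": ["cc", "gcc", "clang", "cl"],
    "Perl 5": ["perl"],
}

if __name__ == "__main__":
    # An empty list means every requirement has at least one candidate.
    print(missing_tools(REQUIRED))
```

    An empty result only shows that the programs can be found; whether they actually work is still determined by the configuration step.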

    Configure.pl

    Notice that Configure.pl has a capitalized first letter. This is an important distinction on Unix and Linux

    systems which are case sensitive.

    The first step in building Parrot is to run the Configure.pl script which will perform some basic tests on your

    system and produce a makefile. To automatically invoke Configure.pl with the most common options, run the

    program Makefile.pl instead. The configuration process performs a number of tests on your system to

    determine some important parameters. These tests may take several minutes on some systems, so be patient. In

    addition, configuration creates a number of platform-specific code files for your system. Without these generated

    files in place, the build process cannot proceed.

    After Configure.pl is finished executing, you should have a file named Makefile (with no suffix). From the

    shell, go to the Parrot directory and type the command "make" (or "nmake" on Windows). This will start the process

    to build Parrot. The Parrot build process could take several minutes because there are a number of steps. We will

    discuss some of these steps in a later section.

    MANIFEST

    The root directory of Parrot contains a file called MANIFEST. MANIFEST contains a list of all necessary files in

    the Parrot repository. If you add a new file to the Parrot source tree, make sure to add that file to MANIFEST.

    Configure.pl checks MANIFEST to ensure all files exist properly before attempting to build.
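
    The kind of check described above can be sketched as follows. This is an illustration in Python, not the code that Configure.pl actually runs. The function names missing_files and demo, and the assumed manifest layout (one relative path per line, with blank lines and lines starting with # ignored), are assumptions made for the example.

```python
import os
import tempfile


def missing_files(manifest_path, root):
    """Return the paths listed in the manifest that do not exist under root.

    Assumed layout (hypothetical): one relative path per line, optionally
    followed by whitespace and extra fields; blank lines and lines
    starting with '#' are ignored.
    """
    missing = []
    with open(manifest_path) as manifest:
        for line in manifest:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            path = line.split()[0]  # keep only the path field
            if not os.path.exists(os.path.join(root, path)):
                missing.append(path)
    return missing


def demo():
    """Build a throwaway source tree and check it against a manifest."""
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "present.c"), "w").close()
        manifest_path = os.path.join(root, "MANIFEST")
        with open(manifest_path, "w") as manifest:
            manifest.write("# example manifest\nMANIFEST\npresent.c\nabsent.c\n")
        return missing_files(manifest_path, root)


if __name__ == "__main__":
    print(demo())
```

    A build script following this idea would stop early and report the missing paths, rather than fail later with a less obvious compiler error.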

    Configure.pl Options

    Depending on what tasks you want to perform, or how you are using Parrot, there are a number of options that can

    be specified to Configure.pl. These options may change the makeup of several generated files, including the

    Makefile. Here, we will list some of these options:

    --help              Shows a help message.
    --version           Prints version information about Configure.pl.
    --verbose           Prints extra information to the console.
    --fatal             If any step fails, kill Configure immediately and do not run additional tests.
    --silent            No output to the console.
    --nomanicheck       Do not check the file MANIFEST to ensure all files exist.
    --languages         Specify a comma-separated list of languages to build after Parrot has been built.
    --ask               Ask the user for answers to common questions, instead of running probes.
    --test              Test the configuration tools first, then Configure, then the build tools. Use --test=configure to test the configuration tools and then run Configure.pl. Use --test=build to run Configure.pl and then also test the build tools.
    --debugging         Set --debugging=0 to turn off debugging. Debugging is on by default.
    --inline            Specify whether your C compiler supports inline code using the C inline keyword.
    --optimize          Compile Parrot using compiler optimizations, and a few other speed-up tricks. Creates a faster bird, but may expose more errors and failures. Use --optimize=(flags) to specify compiler optimization flags to use.
    --parrot_is_shared  Link Parrot dynamically to libparrot, instead of linking statically.
    --m=32              On a 64-bit platform, compile a 32-bit Parrot.
    --profile           Turn on profiling. Only used with the GCC compiler, for now.
    --cage              Turn on additional warnings, for the Cage Cleaners.

    --cc                Specify the compiler to use. For instance, --cc=gcc for the GCC compiler, and --cc=cl for Microsoft's C++ compiler. Use --ccflags to specify any additional compiler flags, and --ccwarn to turn on any additional warnings. Here are some more options:
                        1. To build Parrot with a C++ compiler, use --cxx to specify the compiler to use.
                        2. Use --libs to specify any additional libraries to link Parrot with.
                        3. Use --link to specify a linker.
                        4. Use --linkflags to send options to the linker.
                        5. Use --ld to select a loader.
                        6. Use --ldflags to send flags to the loader.
                        7. Use --make to specify what make utility to use.
    --intval, --floatval, --opcode
                        Set the C data types to use for each value. Notice that --intval and --opcode must be the same, or strange errors may result.
    --ops               Specify any optional OPS files to build.
    --pmc               Specify any optional PMC files to build.
    --without-gmp       Build Parrot without GMP.
    --without-gdbm      Build Parrot without GDBM.
    --without-opengl    Build Parrot without OpenGL support.
    --without-crypto    Build Parrot without the cryptography library.
    --icu-config        Specify a location for the Unicode ICU library on your system.
    --without-icu       Build Parrot without ICU and Unicode support.
    --maintainer        Compile IMCC's tokenizer and parser using Lex and Yacc (or equivalent). Use --lex to specify the name of the lexer, and --yacc to specify the name of the parser.
    --miniparrot        Build miniparrot.
    --prefix            Specify a path prefix.
    --exec-prefix       Specify an execution path prefix.
    --bindir            The directory for binary executable files on your system.
    --sbindir           The system admin executables folder.
    --libexecdir        Program executables folder.
    --datadir           Read-only data directory for machine-independent data.
    --sysconfdir        Read-only data that is machine dependent.
    --sharedstatedir    Modifiable architecture-independent data directory.
    --localstatedir     Modifiable architecture-dependent data directory.
    --libdir            Object code directory.
    --includedir        Folder for compiler include files.
    --oldincludedir     C header file directory for old versions of GCC.
    --infodir           Info documentation directory.
    --mandir            Man pages documentation folder.

    Parrot Executable

    After the build process you should have, among other things, an executable file for Parrot. This will be, on Windows

    systems, named parrot.exe. On other systems, it may be named slightly differently, such as with no suffix.

    Two other files of interest are created: miniparrot.exe and libparrot.dll. These files will be named

    something different if you are not on a Windows system.

    Make Targets

    For readers who are not familiar with the make program, it is a program which can be used to automatically

    determine how to build a software project from source code files. In a makefile, you specify a list of dependencies,

    and the method for producing one file from others. make then determines the order and method to build your

    project.
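
    The way a dependency list determines a build order can be sketched in a few lines of Python. This is only an illustration of the idea; real make also compares file timestamps to decide what needs rebuilding. The function name build_order and the rule set below are hypothetical.

```python
def build_order(rules, target):
    """Return the order in which files must be built to produce target.

    rules maps each buildable file to the list of files it depends on.
    Files that do not appear as keys are treated as existing sources.
    """
    order = []
    visiting = set()

    def visit(name):
        if name in order or name not in rules:
            return
        if name in visiting:
            raise ValueError("circular dependency at " + name)
        visiting.add(name)
        for dependency in rules[name]:
            visit(dependency)  # build prerequisites first
        visiting.discard(name)
        order.append(name)

    visit(target)
    return order


# Hypothetical rules, loosely shaped like a tiny C project.
RULES = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}

if __name__ == "__main__":
    print(build_order(RULES, "app"))
```

    Each name passed as the target plays the role of a make target: asking for "util.o" alone would build only that file and nothing that depends on it.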

    make has targets, which means a single makefile can have multiple goals. For Parrot, a number of targets have been

    defined which can help with building, debugging, and testing. Here is a list of some of the make targets:

    Command         Explanation
    make            Builds Parrot from source. Only rebuilds components that have changed since the last build.
    make clean      Removes all the intermediate files that are left over from the build process. Cleans the directory tree so that Parrot can be completely rebuilt.
    make realclean  Completely removes all temporary files, all intermediate files, and all makefiles. After a make realclean command, you will need to run Configure.pl again.
    make test       Builds Parrot, if needed, and runs the test suite on it. If there are errors in the test results, you can try to fix them yourself, or you can submit a bug report to the Parrot developers. This is always appreciated.
    make fulltest   Builds Parrot, if needed, and runs the test suite on every run core. This can be a very time-consuming operation, and is typically only performed prior to a new release.
    make smoke      Performs smoke testing. This runs the Parrot test suite and attempts to transmit the test results directly to the Parrot development servers. Smoke test results help the developers to keep track of the systems where Parrot is building correctly.

    Submitting Bugs and Patches

    As we mentioned above, smoke testing is an easy way for you to help submit information about Parrot on your

    system. Since Parrot is supposed to support so many different computer architectures and operating systems, it can

    be difficult to know how Parrot is performing on all of them.

    Besides smoke testing, there are a number of ways that you can submit a bug report to Parrot. If you are a capable

    programmer, you may be interested in trying to make fixes and submit patches as well.

    Resources

    http://www.parrotcode.org/docs/gettingstarted.html

    References

    [1] http://www.parrotcode.org/release/devel

    Running Parrot

    Parrot can be run from the command line in a number of modes with a number of different options. There are three forms of input that Parrot can work with directly: Parrot Assembly Language (PASM), which is a low-level human-readable assembly language for the virtual machine; Parrot Intermediate Representation (PIR), which is a syntactic overlay on PASM with nicer syntax for some expressions; and Parrot Bytecode (PBC), which is a compiled binary input format.

    PIR and PASM are converted to PBC during normal execution. Only PBC can be executed by Parrot directly. The

    compilation stage to convert PIR or PASM to PBC takes some time, and can be done separately. We'll be talking

    about these processes a little later.

    Parrot Information

    To get information about the current Parrot version, type:

    parrot -V

    To get a list of command-line options and their purposes, type:

    parrot -h

    We'll discuss all the various command-line options later in this book, but it's always good to have multiple resources

    when a question pops up.

    File Types

    Files that end in .pbc are treated as parrot bytecode files and are executed immediately. Files that end in .pir or

    .pasm are treated as PIR or PASM source code files, respectively, and interpreted. To compile PIR or PASM into

    bytecode, use the -o switch, as such:

    parrot -o output.pbc input.pir

    or

    parrot -o output.pbc input.pasm

    Notice that if we give the output file a .pasm extension, Parrot outputs PASM instead of PBC:

    parrot -o output.pasm input.pir

    To force output of PBC even if the output file does not have a .pbc extension, use the --output-pbc switch. To

    run the generated PBC file after you generate it, use the -r switch.

    To force a file to be run as PASM regardless of the file extension, use the -a switch.

    To force a file to be run as a PBC file, regardless of the file extension, use the -c switch.
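
    The extension rules and the two override switches described above can be summarized as a small decision function. This is a sketch in Python of the rules as stated in this section, not Parrot's actual command-line handling. The function name input_kind is hypothetical, and raising an error for an unrecognized extension is a choice made for the sketch, since the text does not say what Parrot does in that case.

```python
import os

# Mapping taken from the description above.
EXTENSION_KINDS = {".pbc": "pbc", ".pir": "pir", ".pasm": "pasm"}


def input_kind(filename, force_pasm=False, force_pbc=False):
    """Decide how an input file would be treated.

    force_pasm models the -a switch and force_pbc models the -c switch;
    either one overrides the file extension.
    """
    if force_pasm and force_pbc:
        raise ValueError("choose at most one override")
    if force_pasm:
        return "pasm"
    if force_pbc:
        return "pbc"
    extension = os.path.splitext(filename)[1]
    if extension not in EXTENSION_KINDS:
        # Assumption of this sketch: unknown extensions are rejected.
        raise ValueError("unrecognized extension: " + repr(extension))
    return EXTENSION_KINDS[extension]


if __name__ == "__main__":
    for name in ("input.pir", "input.pasm", "output.pbc"):
        print(name, "->", input_kind(name))
```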

    Runtime Options

    Parrot can operate with a number of additional options too.

    Optimizations

    Optimizations can take time to perform, but increase the execution speed of the resulting program. For simple

    programs, short and sloppy one-time programs, extensive optimizations might not make much sense. You would

    spend more time optimizing a piece of software than you would ever spend executing it. However, for programs which are

    run frequently, or for very large programs, or programs which must run continuously with good performance,

    optimizations can be a valuable thing. Compile a program once with optimizations, and the output optimized

    bytecode can be saved to disk, never needing to be optimized again (unless Parrot integrates better optimizations).

    Parrot has multiple optimization options, depending on the extensiveness of the optimizations to be performed. Each

    can be activated using different command-line switches in the form -Ox where the x is a character representing the

    type of optimization to perform:

    Flag        Description
    -O0         No optimizations; this is the default mode.
    -O1 or -O   Optimizations without life info (e.g. branches).
    -O2         Optimizations with life info.
    -Op         Rewrite I and N PASM registers, most used first.
    -Ot         Select fastest runcore (default with -O1 and -O2).
    -Oc         Turns on the optional/experimental tail call optimizations.

    Life info is an analysis step where code and data are traced to determine control flow patterns and lifetimes of local

    variables. Knowing the areas where certain variables are used and not used enables registers to be reused instead of

    having to allocate new ones. Knowing when certain code is unreachable enables the optimizer to ignore it

    completely.
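
    To make the idea of register reuse concrete, here is a minimal sketch in Python. It handles only straight-line code with no branches, and it is not the algorithm Parrot's optimizer uses. The function names live_ranges and assign_registers, and the toy instruction format, are assumptions made for the example.

```python
def live_ranges(instructions):
    """Map each variable to (first_index, last_index) of its appearances.

    Each instruction is a (destination, operands) pair of variable names.
    """
    ranges = {}
    for index, (dest, operands) in enumerate(instructions):
        for name in (dest, *operands):
            first, _ = ranges.get(name, (index, index))
            ranges[name] = (first, index)
    return ranges


def assign_registers(instructions):
    """Give each variable a register number, reusing a register once the
    variable that held it is no longer live."""
    ranges = live_ranges(instructions)
    assignment = {}
    free = []      # register numbers available for reuse
    active = []    # (last_index, register) pairs currently in use
    next_register = 0
    by_start = sorted(ranges.items(), key=lambda item: item[1][0])
    for name, (first, last) in by_start:
        # Release registers whose variables died before this one starts.
        still_active = []
        for end, register in active:
            if end < first:
                free.append(register)
            else:
                still_active.append((end, register))
        active = still_active
        if free:
            register = min(free)
            free.remove(register)
        else:
            register = next_register
            next_register += 1
        assignment[name] = register
        active.append((last, register))
    return assignment


# Toy program: a and b are dead after index 2, so d and e can reuse them.
PROGRAM = [
    ("a", ()),          # 0: a = constant
    ("b", ()),          # 1: b = constant
    ("c", ("a", "b")),  # 2: c = a + b
    ("d", ("c",)),      # 3: d = c * 2
    ("e", ("d",)),      # 4: e = d + 1
]

if __name__ == "__main__":
    print(assign_registers(PROGRAM))
```

    In this toy program five variables fit into three registers, because the lifetimes of a and b end before d and e begin.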

    Run Cores

    The run core is the central loop of the Parrot program, and there are several different runcores available that specify

    the performance and capabilities of Parrot. Runcores determine how Parrot executes the bytecode instructions that

    are passed into the interpreter. Runcores can perform certain tasks such as bounds-checking, testing, or debugging.

    Other runcores have been optimized to operate extremely quickly. Implementation details about the various cores

    can be found in src/runops_cores.c.

    Different cores can be activated by passing particular switches at the command-line. The sections below will discuss

    the various runcores, what they do, how they work, and how to activate them.

    Basic Cores

    Slow Core

    The default "slow" core treats all ops as individual C functions. Each function is called, and returns the address of the

    next instruction operation. Many cores, such as the tracing and debugging cores, are based on the slow core design.
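
    The function-per-op design can be illustrated with a toy interpreter. The sketch below is written in Python rather than C, and its three opcodes, their numbering, and the bytecode layout are invented for the example; it shows only the dispatch pattern, in which each op is a function that returns the position of the next instruction.

```python
SET, ADD, END = range(3)  # hypothetical opcode numbers


def op_set(regs, code, pc):
    """set <dest>, <constant>"""
    regs[code[pc + 1]] = code[pc + 2]
    return pc + 3


def op_add(regs, code, pc):
    """add <dest>, <left>, <right>"""
    regs[code[pc + 1]] = regs[code[pc + 2]] + regs[code[pc + 3]]
    return pc + 4


def op_end(regs, code, pc):
    """end: returning None stops the loop."""
    return None


OP_TABLE = [op_set, op_add, op_end]  # indexed by opcode number


def run_function_core(code, register_count=4):
    """Call one function per op until an op returns None."""
    regs = [0] * register_count
    pc = 0
    while pc is not None:
        pc = OP_TABLE[code[pc]](regs, code, pc)
    return regs


# regs[0] = 2; regs[1] = 3; regs[2] = regs[0] + regs[1]
PROGRAM = [SET, 0, 2, SET, 1, 3, ADD, 2, 0, 1, END]

if __name__ == "__main__":
    print(run_function_core(PROGRAM))
```

    Because every op goes through the same loop, a variant core can wrap the call with extra work, such as printing a trace line or checking bounds, without changing the op functions themselves.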

    Fast Core

    The fast core is a bare-bones core that does not perform any special operations such as tracing, debugging, or

    bounds-checking.

    Computed Goto Core

    Computed goto is a feature of some compilers that allows a goto instruction to target a variable containing the

    address of a label, not necessarily directly to a label. By caching the addresses of all labels into an array, a jump can

    be made directly to the necessary instructions. This avoids the overhead of multiple subroutine calls, and can be very

    quick on platforms that support it. For more information about the workings of the computed-goto runcore, see the

    generated file src/ops/core_ops_cg.c.

    Switch Core

    The switch core uses the standard C switch and case structure to select the next operation to run. At each

    iteration, a switch is performed, and each case represents one of the ops. After the op has been performed, control

    flow jumps back to the top of the switch and the cycle repeats.

    Switch statements, especially those that use many consecutive values, are typically converted by the compiler into

    jump tables which perform very similarly to computed-goto jumps.
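
    For comparison with the function-based design, the switch-style loop can be sketched as follows. Python's if/elif chain stands in for the C switch statement here, and the opcodes and bytecode layout are again invented for the example rather than taken from Parrot.

```python
SET, ADD, END = range(3)  # hypothetical opcode numbers


def run_switch_core(code, register_count=4):
    """Select each op inside a single loop instead of calling a function."""
    regs = [0] * register_count
    pc = 0
    while True:
        opcode = code[pc]
        if opcode == SET:        # set <dest>, <constant>
            regs[code[pc + 1]] = code[pc + 2]
            pc += 3
        elif opcode == ADD:      # add <dest>, <left>, <right>
            regs[code[pc + 1]] = regs[code[pc + 2]] + regs[code[pc + 3]]
            pc += 4
        elif opcode == END:      # end: leave the loop
            return regs
        else:
            raise ValueError("unknown opcode %r at %d" % (opcode, pc))


if __name__ == "__main__":
    print(run_switch_core([SET, 0, 10, SET, 1, 32, ADD, 2, 0, 1, END]))
```

    All op bodies live inside one function, so there is no call and return for each op; control simply falls back to the top of the loop after every case.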

    Variant Cores

    The above cores are the basic designs upon which other specialized cores are based.

    mod_parrot

    Some members of the Parrot team have developed an extension for the Apache webserver that allows Parrot to be

    used to generate server-side content. The result of this work is mod_parrot, which can be used to produce web

    sites using PIR or PASM. This has limited usefulness by itself. However, mod_parrot allows the creation of

    additional modules for languages with compilers that target Parrot. One notable module like this, mod_perl6, is a

    bytecode module that runs on top of mod_parrot.

    More information about mod_parrot is available at its website: http://www.parrot.org/mod_parrot

    Programming For Parrot

    Parrot Programming

    The Parrot Virtual Machine (PVM) can be programmed using a variety of languages, scripts, and techniques. This

    versatility can be confusing at first. Here are some ways that Parrot can be programmed:

    1. Parrot Assembly Language (PASM). This is the lowest-level human-readable way to program Parrot, and is very

    similar to traditional assembly languages.

    2. Parrot Intermediate Representation (PIR). This is a higher-level language which is easier to program in than

    PASM, and significantly more common.

    3. Not Quite Perl (NQP). This is a bare-bones partial implementation of the Perl 6 language, which is designed for

    bootstrapping. It is higher-level than PIR and has many of the features and the capabilities of Perl 6. At the

    moment NQP is not fully-featured and must be compiled separately into bytecode before it can be run on Parrot.

    4. Custom Languages. Using the Parrot Compiler Tools (PCT), new dynamic languages can easily be implemented

    on Parrot. Once a parser and libraries have been created for a language, that language can be used to program for

    Parrot. Many common programming languages, including Perl 6, Python, and JavaScript (ECMAScript) are being

    implemented on Parrot. We will discuss more languages in a later section.

    Programming Steps

    There are a number of different methods to program Parrot, as we see in the list above. However, different

    programming methods require different steps. Here, we will give a very brief overview of some of the ways you can

    program Parrot.

    PASM and PIR

    A program written in PASM or PIR, such as Foo.pasm or Bar.pir, can be run in one of two different

    ways. They can be interpreted directly by typing (on most Windows and Unix/Linux systems):

    ./parrot Foo.pasm

    or

    ./parrot Bar.pir

    This will run Parrot in interpreter mode. However, we can compile these programs down to Parrot Bytecode

    (PBC) using the following flags:

    ./parrot -o Foo.pbc Foo.pasm

    ./parrot -o Bar.pbc Bar.pir

    Of course, you can name the output files anything you want. Once you have a PBC file, you can run it like

    this:

    ./parrot Foo.pbc

    NQP

    NQP must be compiled down to PIR using the NQP compiler. This is located in the compilers/nqp

    directory of the Parrot repository.

    High Level Languages

    To program Parrot in a higher-level language than NQP or PIR, such as Perl 6, Python, or Ruby, there must

    first be a compiler available for that language. To run the file Foo.pl, for example (".pl" is the file extension for

    Perl programs), you would type:

    ./parrot languages/perl6/perl6.pbc Foo.pl

    This runs the Perl 6 compiler on Parrot, and passes the file name Foo.pl to the compiler. To output a file

    into PIR or PBC, you would use the --target= option to specify an output format.

    Virtual Machines?

    One term that we are going to be using frequently in this book is "Virtual Machine", or VM for short. It's worth

    discussing now what exactly a VM is.

    Before talking about virtual machines, let's consider actual computer hardware first. In an ordinary computer system, a native machine, there is a microprocessor which takes instructions and performs the necessary actions. Programs are written in a high-level language and compiled into the binary machine code that the processor uses. The problem with this is that different types of processors use different machine code, and to get a program to run on different platforms it needs to be recompiled for each.

    A virtual machine, unlike a regular computer processor, is built using software, not hardware. The virtual machine is

    written in a high level language, and compiled to machine code as usual. However, programs that run on the virtual

    machine are compiled into bytecode instead of machine code. This bytecode runs on top of the virtual machine, and

    the virtual machine converts it into processor instructions.

    Here is a table that summarizes some of the important differences between a virtual machine and a native machine:

    Implementation
        Native machine: Hardware
        Virtual machine: Software

    Speed of Execution
        Native machine: Fast
        Virtual machine: Slow

    Native Machine Code Programming
        Native machine: Must compile every program into native machine code
        Virtual machine: Must only compile the virtual machine into native machine code, everything else is converted to bytecode

    Portability
        Native machine: Every program must be recompiled on every new hardware platform
        Virtual machine: Programs only need to be compiled into bytecode once, and can run anywhere a VM is installed

    Extensibility
        Native machine: Impossible
        Virtual machine: Virtual machines can be improved, extended, patched, and added to over time

    Parrot Assembly Language

    The Parrot Virtual Machine (PVM) operates on a special-purpose bytecode format. All instructions in PVM are converted into bytecode instructions to be operated on by the virtual machine. In the same way that ordinary assembly languages share a one-to-one correspondence with the underlying machine code words, so too do the Parrot bytecode words have a similar correspondence with Parrot Assembly Language (PASM).

    PASM is very similar to traditional assembly languages, except that the instructions provide access to many of

    the dynamic and high-level features of the Parrot system.

    Instruction Types and Operands

    Internally, Parrot has a great many different instructions. Some instructions are just variations of each other

    with the same behavior, but different arguments. For instance, there are instructions for:

    add_n_n_n

    add_i_i_i

    add_i_i_ic

    add_n_n_nc

    The letters after the name of the instruction specify what kinds of operands that instruction requires. add_i_i_ic

    takes an integer (i) and an integer constant (ic) and returns an integer (i). add_n_n_n takes two floating point

    numbers and returns a floating point number.

    In PASM, when you write the following statement:

    add $I0, $I1, $I2

    Parrot looks up the appropriate instruction from the list, add_i_i_i and calls it. The user sees 1 instruction,

    "add", but Parrot actually has multiple instructions and decides which to use automatically for you. If you type the

    following into Parrot:

    add $P0, $I1, $I2

    You will get an error message that there is no such instruction add_p_i_i. This should help you debug your

    programs.
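
    The lookup described above can be sketched in Python. This is an illustration of the naming scheme only, not Parrot's actual instruction selection. The function names operand_code and select_op are hypothetical, the op list is limited to the four names mentioned in this section, and the register letters (I, N, S, P) follow the convention used in this chapter.

```python
import re

REGISTER = re.compile(r"^\$([INSP])\d+$")

# Hypothetical subset of the full op list: only the names shown above.
KNOWN_OPS = {"add_n_n_n", "add_i_i_i", "add_i_i_ic", "add_n_n_nc"}


def operand_code(operand):
    """Return the suffix letter(s) that describe one operand."""
    match = REGISTER.match(operand)
    if match:
        return match.group(1).lower()  # register: i, n, s, or p
    try:
        int(operand)
        return "ic"                    # integer constant
    except ValueError:
        pass
    try:
        float(operand)
        return "nc"                    # floating point constant
    except ValueError:
        raise ValueError("unrecognized operand: " + operand)


def select_op(mnemonic, operands):
    """Build the full op name and check that such an op exists."""
    full_name = "_".join([mnemonic] + [operand_code(o) for o in operands])
    if full_name not in KNOWN_OPS:
        raise LookupError("no such instruction " + full_name)
    return full_name


if __name__ == "__main__":
    print(select_op("add", ["$I0", "$I1", "$I2"]))
    print(select_op("add", ["$I0", "$I1", "5"]))
```

    Asking this sketch for add with the operands $P0, $I1, $I2 raises a LookupError naming add_p_i_i, which mirrors the error message described above.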

    Parrot Assembly Basics

    Parrot is a register-based virtual machine. There are an undetermined number of registers that do not need to be

    instantiated before they are called. The virtual machine will make certain to create registers as they are needed, and

    rearrange them as makes sense to do so. Register names are lexically scoped, so register "$P0" in one function is not

    necessarily the same data location as register "$P0" in another function.

    All registers start with a "$" sign. Following the "$", called the "sigil", there is a letter that denotes the data type of

    the register, followed by the register number. There are 4 types of data items, each with a unique register character

    identifier. These are:

    String

    String registers start with an "S". String registers can be named things like "$S0" or "$S100".

    Integer

    Integer registers start with an "I". Integer registers can be named things like "$I0" or "$I56".

    Number

    Floating point number registers, registers which can hold a floating point number, start with a letter "N". These

    registers can be named things like "$N0" or "$N354".

    PMC

    PMCs are advanced object-oriented data types, and a PMC register can be used to hold many different kinds of

    data. PMC registers start with a "P" identifier, and can be named things like "$P0" or "$P35".

    Basic Statements

    A basic PASM statement contains an optional label, an instruction mnemonic, and a series of comma-separated

    arguments. Here is an example:

    my_label: add_n $P0, $P1, $I1

    In this example the add_n instruction performs addition on two registers and stores the result in a third. The values

    from $P1 and $I1 are added together, and the result is stored in $P0. Notice that the operands are different types. One

    of the arguments and the result are both PMC registers, but the second operand is an integer register and the add_n

    instruction is a floating-point instruction. Parrot will automatically handle data type conversions as necessary when

    performing instructions like this. The only thing that is required is that it is possible to convert between two data

    types. If it is possible, Parrot will handle the details. In some cases, however, automatic type conversions are not

    possible and in these cases Parrot will raise an exception.

    Directives

    PASM has few available directives.

    .pcc_sub

    This directive defines the start of a new subroutine.

    Resources

    http://www.parrotcode.org/docs/pdd/pdd06_pasm.html

    Parrot Intermediate Representation

    The Parrot Intermediate Representation (PIR) is similar in many respects to the C programming language: It's

    higher-level than assembly language but it is still very close to the underlying machine. The benefit to using PIR is

    that it's easier to program in than PASM, but at the same time it exposes all of the low-level functionality of Parrot.

    PIR has two purposes in the world of Parrot. The first is to be used as a target for automatic code generators from

    high-level languages. Compilers for high-level languages emit PIR code, which can then be interpreted and

    executed. The second purpose is to be a low-level human-readable programming language in which basic

    components and Parrot libraries can be written. In practice, PASM exists only as a human-readable direct translation

    of Parrot's bytecode, and is rarely used to program by humans directly. PIR is used almost exclusively to write

    low-level software for Parrot.

    PIR Syntax

    PIR syntax is similar in many respects to older programming languages such as C or BASIC. In addition to

    PASM-like operations, there are control structures and arithmetic operations which simplify the syntax for human

    readers. All PASM is legal PIR code; PIR is little more than an overlay of fancy syntax over the raw PASM

    instructions. When available, you should always use PIR's syntax instead of PASM's for ease.

    Even though PIR has more features and better syntax than PASM, it is not itself a high-level language. PIR is still very low-level and is not really intended for use building large systems. There are so many other tools available to language and application designers on Parrot that PIR only really needs to be used in a small subset of areas. Eventually, enough tools might be created that PIR never needs to be used directly.

    PIR and High-Level Languages

    PIR is designed to help implement higher-level languages such as Perl, TCL, Python, Ruby, and PHP. As we've

    discussed before, high-level languages (HLL) are related to PIR in two possible ways:

    1. We write a compiler for the HLL using the language NQP and the Parrot Compiler Tools (PCT). This compiler is

    then converted to PIR, and then to Parrot bytecode.

    2. We write code in the HLL and compile it. The compiler converts the code into a tree-like intermediate

    representation called PAST, then to another representation called POST, and finally to PIR code. From here, the PIR

    can be interpreted directly, or else it can be further compiled to Parrot bytecode.

    PIR, therefore, has features that help to enable writing compilers, and it also has features that support the HLLs that

    are written using those compilers.

    Comments

    Similarly to Perl, PIR uses the "#" symbol to start comments. Comments run from the # until the end of the current

    line. PIR also allows the use of POD documentation in files. We'll talk about POD in more detail later.

    Subroutines

    Subroutines start with the .sub directive, and end with the .end directive. We can return values from a

    subroutine using the .return directive. Here is a short example of a function that takes no parameters and returns

    an approximation of π:

    .sub 'GetPi'

    $N0 = 3.14159

    .return($N0)

    .end

    Notice that the subroutine name is written in single quotes. This isn't a requirement, but it's very helpful and should

    be done whenever possible. We'll discuss the reasons for this below.

    Subroutine Calls

    There are two methods to call a subroutine: Direct and Indirect. In a direct call, we call a specific subroutine by

    name:

    $N1 = 'GetPi'()

    In an indirect call, however, we call a subroutine using a string that contains the name of that subroutine:

    $S0 = 'GetPi'

    $N1 = $S0()

    The problem arises when we start to use named variables (which we will discuss in more detail below). Consider the

    following snippet where we have a local variable called "GetPi":

    GetPi = 'MyOtherFunction'

    $N0 = GetPi()

    In this snippet here, do we call the function "GetPi" (since we made the call GetPi()) or do we call the function

    "MyOtherFunction" (since the variable GetPi contains the value 'MyOtherFunction')? The short answer is that we

    would call the function "MyOtherFunction" because local variable names take precedence over function names in

    these situations. However, this is a little confusing, isn't it? To avoid this confusion, there are some standards that

    people use to make this easier:

    $N0 = GetPi()      Used only for indirect calls

    $N0 = 'GetPi'()    Used for all direct calls

    By sticking with this convention, we avoid all possible confusions later on.
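
    As a minimal sketch of this convention, the following complete program makes a direct call to 'GetPi' by its quoted name. The 'main' subroutine and the use of the say opcode to print the result are assumptions added here for illustration.

```pir
.sub 'main' :main
    # Direct call: the quoted name always refers to the subroutine itself,
    # even if a local variable with the same name exists.
    $N1 = 'GetPi'()
    say $N1
.end

.sub 'GetPi'
    $N0 = 3.14159
    .return ($N0)
.end
```

    Because the name is quoted, adding a local variable called GetPi to 'main' later would not change which subroutine is called.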

    Subroutine Parameters

    Parameters to a subroutine can be declared using the .param directive. Here are some examples:

    .sub 'MySub'

    .param int myint

    .param string mystring

    .param num mynum

    .param pmc mypmc

    The .param directives must come first in the subroutine. You may not put comments or

    other code between the .sub and .param directives. Here is the same example with a misplaced comment:

    .sub 'MySub'

    # These are my params:

    .param int myint

    .param string mystring

    .param num mynum

    .param pmc mypmc

    Wrong!

    Named Parameters

    Parameters that are passed in a strict order, like the ones we've seen above, are called positional parameters. Positional

    arguments are distinguished from one another by their position in the function call. Putting positional arguments in a

    different order will produce different effects, or may cause errors. Parrot supports a second type of parameter, the

    named parameter. Instead of being matched by their position in the argument list, named parameters are passed by name and

    can appear in any order. Here's an example:

    .sub 'MySub'

    .param int yrs :named("age")

    .param string call :named("name")

    $S0 = "Hello " . call

    $S2 = yrs # Convert the integer to a string before concatenating

    $S1 = "You are " . $S2

    $S1 = $S1 . " years old"

    print $S0

    print $S1

    .end

    .sub main :main

    'MySub'("age" => 42, "name" => "Bob")

    .end

    In the example above, we could have easily reversed the order too:

    .sub main :main

    'MySub'("name" => "Bob", "age" => 42) # Same!

    .end

    Named arguments can be a big help because you don't have to worry about the exact order of variables, especially as

    argument lists get very long.

    Optional Parameters

    Functions may declare optional parameters, which the caller may or may not specify. To do this, we use the

    :optional and :opt_flag modifiers:

    .sub 'Foo'

    .param int bar :optional

    .param int has_bar :opt_flag

    In this example, the parameter has_bar will be set to 1 if bar was supplied by the caller, and will be 0

    otherwise. Here is some example code that takes two numbers and adds them together. If the second argument is not

    supplied, the first number is doubled:

    .sub 'AddTogether'

    .param num x

    .param num y :optional

    .param int has_y :opt_flag

    if has_y goto ive_got_y

    y = x

    ive_got_y:

    $N0 = x + y

    .return($N0)

    .end

    And we will call this function with

    'AddTogether'(1.0, 1.5) #returns 2.5

    'AddTogether'(3.0) #returns 6.0

    Slurpy Parameters

    A subroutine can take any number of arguments, which can be loaded into an array. Parameters which can accept a

    variable number of input arguments are called :slurpy parameters. Slurpy arguments are loaded into an array

    PMC, and you can loop over them inside your function if you wish. Here is a short example:

    .sub 'PrintList'

    .param pmc list :slurpy

    $S0 = join " ", list # Join the elements into one string

    print $S0

    .end

    .sub 'PrintOne'

    .param pmc item

    print item

    .end

    .sub main :main

    'PrintList'(1, 2, 3) # Prints "1 2 3"

    'PrintOne'(1, 2, 3) # Prints "1"

    .end

    Slurpy parameters absorb the remainder of all function arguments. Therefore, slurpy parameters should only be the

    last argument to a function. Any parameters after a slurpy parameter will never take any values, because all

    arguments passed for them will get absorbed by the slurpy parameter instead.
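
    To illustrate looping over a slurpy parameter, here is a small sketch that adds up however many integers it is given. The names 'Sum', args, count, i, and total are hypothetical and chosen only for this example.

```pir
.sub 'Sum'
    .param pmc args :slurpy      # all arguments arrive in one array PMC
    .local int count
    .local int i
    .local int total
    count = elements args        # number of arguments that were passed
    i = 0
    total = 0
  next_arg:
    if i >= count goto done
    $I0 = args[i]                # fetch element i as an integer
    total = total + $I0
    inc i
    goto next_arg
  done:
    .return (total)
.end

.sub 'main' :main
    $I0 = 'Sum'(1, 2, 3, 4)
    say $I0
.end
```

    The same subroutine accepts one argument or twenty, because the loop bound comes from the size of the array rather than from the parameter list.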

    Flat Argument Arrays

    If you have an array PMC that contains data for a function, you can pass in the array PMC. The array itself will

    become a single argument, which arrives in the function as a single array PMC. However, if you use the

    :flat keyword when calling a function with an array, it will pass each element of the array into a different

    parameter. Here is an example function:

    .sub 'ExampleFunction'

    .param pmc a

    .param pmc b

    .param pmc c

    .param pmc d :slurpy

    We have an array called x that contains three Integer PMCs: [1, 2, 3]. Here are two examples:

    Function Call    'ExampleFunction'(x, 4, 5)    'ExampleFunction'(x :flat, 4, 5)
    Parameters       a = [1, 2, 3]                 a = 1
                     b = 4                         b = 2
                     c = 5                         c = 3
                     d = []                        d = [4, 5]
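
    The following sketch shows :flat in a complete program. It assumes the core ResizablePMCArray type for the array; the subroutine name 'ShowThree' is hypothetical.

```pir
.sub 'ShowThree'
    .param pmc a
    .param pmc b
    .param pmc c
    print a
    print " "
    print b
    print " "
    say c
.end

.sub 'main' :main
    .local pmc x
    x = new 'ResizablePMCArray'
    push x, 1
    push x, 2
    push x, 3
    # :flat spreads the three elements of x over the parameters a, b, and c.
    'ShowThree'(x :flat)
.end
```

    Without :flat, the whole array would be passed as the single argument a, and the call would no longer supply values for b and c.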

    Variables

    Local Variables

    Local variables can be defined using the .local directive, using a similar syntax as is used with parameters:

    .local int myint

    .local string mystring

    .local num mynum

    .local pmc mypmc

    In addition to local variables, in PIR you can use the registers for data storage as well.
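
    As a small sketch of how the two can be mixed, the example below keeps a meaningful value in a named local variable and uses a numeric register as scratch space. The variable name radius is hypothetical.

```pir
.sub 'main' :main
    .local num radius            # named local variable
    radius = 2.0
    $N0 = radius * radius        # $N0 is a numeric register used as a temporary
    $N0 = $N0 * 3.14159
    say $N0                      # area of a circle with the given radius
.end
```

    Named locals tend to be used for values that matter to the reader, and registers for short-lived intermediate results.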

    Namespaces

    Namespaces are constructs that allow the reuse of function and variable names without causing conflicts with

    previous incarnations. Namespaces are also used to keep the methods of a class together, without causing naming

    conflicts with functions of the same names in other namespaces. They are a valuable tool in promoting code reuse

    and decreasing naming pollution.

    In PIR, namespaces are specified with the .namespace directive. Namespaces may be nested using a key

    structure:

    .namespace ["Foo"]

    .namespace ["Foo";"Bar"]

    .namespace ["Foo";"Bar";"Baz"]

    The root namespace can be specified with an empty pair of brackets:

    .namespace [] #Right! Enters the root namespace

    .namespace #WRONG! Brackets are required!
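
    Here is a hedged sketch of a subroutine that lives in a nested namespace and is called from the root namespace. It assumes the get_hll_global opcode for the lookup; the subroutine name 'hello' is hypothetical.

```pir
.sub 'main' :main
    # Look up 'hello' in the nested namespace ["Foo";"Bar"] and call it.
    $P0 = get_hll_global ['Foo'; 'Bar'], 'hello'
    $P0()
.end

.namespace ['Foo'; 'Bar']

.sub 'hello'
    say "Hello from Foo;Bar"
.end
```

    Another subroutine named 'hello' could be defined in a different namespace without any conflict with this one.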

    Strings

    Strings are a fundamental datatype in PIR, and are incredibly flexible. Strings can be specified as quoted literals, or

    as "Heredoc" literals in the code.

    Heredocs

    Heredoc string literals have become a common tool in modern programming languages to specify very long

    multi-line string literals. Perl programmers will be familiar with them, but so will most shell programmers and even

    modern .NET programmers too. Here is how a Heredoc works in PIR:

    $S0 = <<"TAG"

    This is part of the Heredoc string. Everything between the

    '<<"TAG"' line and the closing 'TAG' line is part of the string.

    TAG

    .sub 'GetPi' :method

    $N0 = 3.14159

    .return($N0)

    .end

    .sub 'GetE' :method

    $N0 = 2.71828

    .return($N0)

    .end

    With this class (which we probably store in "MathConstants.pir" and include into our main file), we can write the

    following things:

    .local pmc mathconst

    mathconst = new 'MathConstants'

    $N0 = mathconst.'GetPi'() #$N0 contains the value 3.14159

    $N1 = mathconst.'GetE'() #$N1 contains the value 2.71828

    We'll explain more of the messy details later, but this should be enough to get you started.

    Control Statements

    PIR is a low-level language and so it doesn't support any of the high-level control structures that programmers may

    be used to. PIR supports two types of control structures: conditional and unconditional branches.

    Unconditional Branches are handled by the goto instruction.

    Conditional Branches use the goto command also, but accompany it with an if or unless statement. The jump is

    only taken if the if-condition is true or the unless-condition is false.
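
    Higher-level constructs such as loops and if/else blocks have to be built from these branches and labels. The sketch below counts from 1 to 5 and reports whether each number is odd or even; the label names are hypothetical.

```pir
.sub 'main' :main
    .local int i
    i = 1
  loop_top:
    if i > 5 goto loop_end        # conditional branch: leave the loop
    $I0 = i % 2
    unless $I0 goto is_even       # taken when $I0 is zero (false)
    print i
    say " is odd"
    goto next_iteration           # unconditional branch
  is_even:
    print i
    say " is even"
  next_iteration:
    inc i
    goto loop_top
  loop_end:
.end
```

    The loop_top and loop_end labels play the role of a while loop, and the is_even and next_iteration labels play the role of an if/else block.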

    HLL Namespace

    Each HLL compiler has a namespace that is the same as the name of that HLL. For instance, if we were

    programming a compiler for Perl, we would create the namespace .namespace ["Perl"]. If we are not writing

    a compiler, but instead writing a program in pure PIR, we would be in the default namespace .namespace

    ["Parrot"]. To create a new HLL compiler, we would use the .HLL directive to create the current default HLL

    namespace:

    .HLL "mylanguage", "mylanguage_group"

    Everything that is in the HLL namespace is visible to programs written in that HLL. For example, if we have a PIR

    function "Foo" that is in the "PHP" namespace, a program written in PHP can call the Foo function as if it were a

    regular PHP function. This may sound a little bit complicated. Here is a short example:

    PIR code:

    .namespace ["perl6"]

    .sub 'AddTwo'

    .param int a

    .param int b

    $I0 = a + b

    .return($I0)

    .end

    Perl 6 code:

    $x = AddTwo(4, 5);

    To simplify, we can write simply .namespace (without the brackets) to return to the current HLL namespace.

    Multimethods

    Multimethods are groups of subroutines which share the same name. For instance, the subroutine "Add" might have

    different behavior depending on whether it is passed a Perl 5 Floating point value, a Parrot BigNum PMC, or a Lisp

    Ratio. Multiple dispatch subroutines are declared like any other subroutine in PIR, except they also have the

    :multi flag. When a Multi is invoked, Parrot loads the MultiSub PMC object with the same name, and starts to

    compare parameters. Whichever subroutine has the best match to the accepted parameter list gets invoked. The "best

    match" routine is relatively advanced. Parrot uses a Manhattan distance to order subroutines by their closeness to the

    given list, and then invokes the subroutine at the top of the list.

    When sorting, Parrot takes into account roles and multiple inheritance. This makes it incredibly powerful and

    versatile.
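
    As a minimal sketch of multiple dispatch, the two subroutines below share the name 'Describe' and differ only in their :multi signature. The name 'Describe' and the messages are hypothetical; Integer and String are core PMC types.

```pir
.sub 'Describe' :multi(Integer)
    .param pmc value
    say "got an Integer"
.end

.sub 'Describe' :multi(String)
    .param pmc value
    say "got a String"
.end

.sub 'main' :main
    $P0 = new 'Integer'
    $P0 = 42
    'Describe'($P0)              # dispatches to the Integer variant

    $P1 = new 'String'
    $P1 = "forty-two"
    'Describe'($P1)              # dispatches to the String variant
.end
```

    The caller uses the same name both times; the type of the argument decides which variant runs.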

    MultiMethods, MultiSubs, and other key words

    The vocabulary on this page might start to get a little bit complicated. Here, we will list a few terms which are used

    to describe things in Parrot.

    Subroutine

    A basic block of code with a name and a parameter list.

    Method

    A basic block of code which belongs to a particular class and can be called on an object of that class. Methods

    are just subroutines with an extra implicit self parameter.

    Multi Dispatch

    Where multiple subroutines have the same name, and Parrot selects the best one to invoke.

    Single Dispatch

    Where there is only one subroutine with the given name, and Parrot does not need to do any fancy sorting or

    selecting.

    MultiSub

    A PMC type that stores a collection of subroutines which can be invoked by name and sorted/searched by

    Parrot.

    MultiMethod

    Same as a MultiSub, except it is called as a method instead of a subroutine.

    PIR Macros and Constants

    PIR allows a text-replacement macro functionality, similar in concept (but not in implementation) to those used by

    C's preprocessor. PIR does not have preprocessor directives that support conditional compilation.

    Macro Constants

    Constant values can be defined with the .macro_const keyword. Here is an example:

    .macro_const PI 3.14

    .sub main :main

    print .PI #Prints "3.14"

    .end

    A .macro_const can be an integer constant, a floating point constant, a string literal, or a register name. Here's

    another example:

    .macro_const MyReg S0

    .macro_const HelloMessage "hello world!"

    .sub main :main

    .MyReg = .HelloMessage

    print .MyReg

    .end

    This allows you to give names to common constants, strings, or registers.

    Macros

    Basic text-substitution macros can be created using the .macro and .endm keywords to mark the start and end of

    the macro respectively. Here is a quick example:

    .macro SayHello

    print "Hello!"

    .endm

    .sub main :main

    .SayHello

    .SayHello

    .SayHello

    .end

    This example, as should be obvious, prints out the word "Hello!" three times. We can also give our macros

    parameters, to be included in the text substitution:

    .macro CircleCircumference(r)

    $N0 = .r * 3.14 # Macro parameters are referenced with a leading dot

    $N0 = $N0 * 2

    print $N0

    .endm

    .sub main :main

    .CircleCircumference(5)

    .CircleCircumference(10)

    .end

    Macro Local Variables

    What if we want to define a temporary variable inside the macro? Here's an idea:

    .macro PrintSomething

    .local string something

    something = "This is a message"

    print something

    .endm

    .sub main :main

    .PrintSomething

    .PrintSomething

    .end

    After we do the text substitution, we get this:

    .sub main :main

    .local string something

    something = "This is a message"

    print something

    .local string something

    something = "This is a message"

    print something

    .end

    After the substitution, we've declared the variable something twice! Instead of that, we can use the

    .macro_local declaration to create a variable with a unique name that is local to the macro:

    .macro PrintSomething

    .macro_local string something

    .something = "This is a message"

    print .something

    .endm

    Now, the same function translates to this after the text substitution:

    .sub main :main

    .local string main_PrintSomething_something_1

    main_PrintSomething_something_1 = "This is a message"

    print main_PrintSomething_something_1

    .local string main_PrintSomething_something_2

    main_PrintSomething_something_2 = "This is a message"

    print main_PrintSomething_something_2

    .end

    Notice how the local variable declarations are now unique. The generated names depend on the name of the variable, the name of

    the macro, and other information from the file. This is a reusable approach that doesn't cause any problems.

    Resources

    http://docs.parrot.org/parrot/latest/html/docs/pdds/pdd19_pir.pod.html

    Parrot Magic Cookies

    Polymorphic Containers (PMCs)

    Polymorphic Containers (PMCs) -- which were previously known as 'Parrot Magic Cookies' -- are one of the

    fundamental data types of Parrot, and are one of the most powerful and flexible data types available. A PMC is very

    much like a class object, with data storage and associated class methods. PMCs include all aggregate data types

    including arrays, associative arrays (Hashes), Exceptions, Structures, and Objects. Parrot comes with a core set of

    PMCs, but new PMCs can be added for use with specific programs or languages.

    PMCs are written in a C-like language that we will call "PMC Script" and compiled. PMCs can be built-in to Parrot

    directly, or they can be written separately and loaded in later. PMCs which are loaded at runtime are called "dynamic

    PMCs", or DYNPMCs for short.

    Writing PMCs in C

    PMC definitions are written in a C-like language that is translated to C code using a special PMC compiler program

    called pmc2c.pl. Once converted to C code, the PMCs are included in the Parrot build process.

    The PMC Compiler

    The PMC Compiler, pmc2c.pl, has a number of tasks to perform. It converts the PMC into legal C syntax, inserts

    the function names in the appropriate tables, and exports information about the PMC and its methods to the rest of

    the Parrot system.

    PMC Script

    The script language used to write a PMC is based on C. In fact, it's mostly C with a few additional keywords and

    constructs. The PMC compiler converts PMC files into C code for compilation. All standard ANSI C 89 code is

    acceptable for use in PMC files. Here we will list some of the additions.

    PMC Class Definition

    All the methods and vtables of the PMC must be enclosed in a PMC class declaration:

    pmclass NAME {

    }

    In addition to just giving the name of the PMC, you can specify single-inheritance too:

    pmclass NAME is SUPERNAME { }

    Where SUPERNAME is the name of the parent PMC class. In your PMC vtable methods you can use the SUPER

    keyword to access the vtable methods of the parent class.

    You can also allocate an additional storage area called a PMC_EXT using the need_ext keyword. PMC_EXT is an

    additional structure that can be allocated to help with special operations, such as sharing between multiple

    interpreters. If the PMC is not automatically thread safe, you should add a PMC_EXT.

    Specifier            Meaning
    is SUPERNAME         Specifies the parent class, if any
    need_ext             Needs a PMC_EXT for special handling
    abstract             The class is abstract and cannot be instantiated
    no_init              The PMC does not have an init vtable method for Parrot to call. Normally, Parrot calls
                         the init method when the PMC is first created. If you don't need that, use no_init.
    provides INTERFACE   INTERFACE is one of the standard interfaces, and the PMC can be used as if it were an
                         object of that type. The interfaces are "array", "hash"

    Helper Functions

    Like ordinary C, you can define additional functions to help with your calculations. These functions should be written

    in ordinary C (without any special keywords or values) and should be defined outside of the pmclass definition.

    Defining PMC Attributes

    PMCs can be given a custom set of data field attributes using the ATTR keyword. ATTR allows the PMC to be

    extended to contain custom data structures that are automatically managed by Parrot's memory subsystem. Here's an

    example:

    pmclass Foo {

    ATTR INTVAL bar;

    ATTR PMC *baz;

    ...

    }

    The attributes are stored in a custom data structure that can be accessed using a macro with the same name as the

    PMC, but all upper-case:

    Parrot_Foo_attributes * attrs = PARROT_FOO(SELF);

    attrs->bar = 7; /* it's an INTVAL */

    attrs->baz = pmc_new( ... ); /* It's a PMC */

    Notice how the type name of the attributes structure is Parrot_, followed by the name of the PMC with the same

    capitalization as is used in the pmclass definition, followed by _attributes. The macro to return this structure

    is PARROT_ followed by the name of the PMC in all caps.

    VTABLEs and Methods

    Note:

    The VTABLE interface, and the specific functions in a vtable are subject to change before the Parrot 1.0 release.

    PMCs can supply definitions for any number of VTABLE interfaces. Any interfaces not defined will fall back to a

    default implementation which throws an error. VTABLE interfaces must all follow a pre-defined format, and

    attempting to define a VTABLE interface that is not one of the normal interfaces or does not use the same parameter

    list and return value as the normal interfaces will throw an error.

    The parameters for all VTABLE and METHOD declarations may be either INTVAL, FLOATVAL, STRING, or

    PMC, as these are the only values which can be passed from PIR code. VTABLE Interfaces are defined with the

    VTABLE keyword, and Methods on the PMC can be defined with the METHOD keyword.

    VTABLES

    All PMCs have a standard API, an interface that they share in common with all other PMCs. This standard interface

    is called a VTABLE. A VTABLE is a list of about 150 standard functions, called "VTABLE Interfaces" that

    implement basic, common, behavior for PMCs. All PMCs implement all these interfaces, although if one is not

    explicitly provided it can inherit from a parent PMC class, or it can default to throwing an exception.

    VTABLE methods can be defined in one of two ways: in the .pmc file using the C-like PMC language, or in PIR using

    the :vtable function qualifier. VTABLEs correspond to some basic operations that can be performed on any

    object, such as arithmetic, class operations, casting operations (to INTVAL, FLOATVAL, STRING, or PMC), and

    other common operations. Regardless of how a VTABLE method is defined, it must have one of the predefined names.
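
    As a hedged sketch of the PIR route, the example below overrides the get_string VTABLE interface for a class created at runtime with the newclass opcode. The class name 'Greeter' and its message are hypothetical.

```pir
.sub 'main' :main
    # Create the class; subs in the ['Greeter'] namespace are attached to it.
    $P0 = newclass 'Greeter'
    $P1 = new 'Greeter'
    # Assigning a PMC to a string register invokes its get_string vtable.
    $S0 = $P1
    say $S0
.end

.namespace ['Greeter']

.sub 'get_string' :vtable
    .return ("Hello from a Greeter object")
.end
```

    The subroutine has to be named get_string for the override to be recognized, which is why the naming rule above matters.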

    Writing VTABLE Interfaces

    VTABLE functions all have fixed names and parameter lists. When implementing a new VTABLE method, you

    must strictly conform to this, or there could be several compilation errors and warnings. For a list of all vtable

    methods and their expected function signatures, you can check out the header file

    /include/parrot/vtables.h .

    Inside a VTABLE method there are several available keywords that can be used:

    SELF

    the current PMC

    INTERP

    the parrot interpreter

    SUPER

    The parent PMC class.

    You can also reference other methods or vtable methods of the current PMC using a standard dot notation such as:

    SELF.VTABLE_OR_METHOD_NAME()

    If you want to default all or part of your processing to the super class (if you have a superclass), you can use the

    SUPER() function to do that. Any vtable method that you do not implement will be automatically defaulted to the

    super class (if any) or to the default parent class.

    Methods

    In addition to VTABLEs, a PMC may supply a series of custom interface functions called methods to supply

    additional functionality. Notice that methods are not integrated into the PIR operators or PASM opcodes in the same

    way that VTABLE methods are. Methods can be written in the C-like PMC script for individual PMCs, or they can

    be written in PIR for user-defined PMC subclasses.

    Invoking Methods

    Once a method has been defined, it can be accessed in a PMC file using the PCCINVOKE command.
