Perl & LWP - Helion

Perl & LWP
ebook
Author: Sean M. Burke
ISBN: 978-05-965-5209-1
Pages: 262, Format: ebook
Publication date: 2002-06-20
Bookstore: Helion

Book price: 118.15 zł (previously: 137.38 zł)
You save: 14% (-19.23 zł)

Tags: Perl - Programming | Perl/CGI - Programming

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages.

The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether it's buying or selling, reading or writing, uploading or downloading, news to e-commerce, can be controlled with Perl and LWP. You can automate Web-based purchase orders as easily as you can set up a program to download MP3 files from a web site.

Perl & LWP covers:

  • Understanding LWP and its design
  • Fetching and analyzing URLs
  • Extracting information from HTML using regular expressions and tokens
  • Working with the structure of HTML documents using trees
  • Setting and inspecting HTTP headers and response codes
  • Managing cookies
  • Accessing information that requires authentication
  • Extracting links
  • Cooperating with proxy caches
  • Writing web spiders (also known as robots) in a safe fashion

Perl & LWP includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from the web sites of BBC News, AltaVista, ABEBooks.com, and the Weather Underground, to name just a few, are explained in detail, so that you understand how and why they work.

Perl programmers who want to automate and mine the web can pick up this book and be immediately productive. Written by a contributor to LWP, and with a foreword by one of LWP's creators, Perl & LWP is the authoritative guide to this powerful and popular toolkit.

Customers who bought "Perl & LWP" also chose:

  • Perl. Mistrzostwo w programowaniu
  • Wielkie umysły programowania. Jak myślą i pracują twórcy najważniejszych języków
  • Learning Perl. Making Easy Things Easy and Hard Things Possible. 7th Edition
  • 100 sposobów na Perl
  • Mastering Perl. 2nd Edition

Table of contents

Perl & LWP eBook -- table of contents

  • Perl & LWP
    • SPECIAL OFFER: Upgrade this ebook with O'Reilly
    • A Note Regarding Supplemental Files
    • Foreword
    • Preface
      • Audience for This Book
      • Structure of This Book
      • Order of Chapters
      • Important Standards Documents
      • Conventions Used in This Book
      • Comments & Questions
      • Acknowledgments
    • 1. Introduction to Web Automation
      • 1.1. The Web as Data Source
        • 1.1.1. Screen Scraping
        • 1.1.2. Brittleness
        • 1.1.3. Web Services
      • 1.2. History of LWP
      • 1.3. Installing LWP
        • 1.3.1. Installing LWP from the CPAN Shell
          • 1.3.1.1. Configuring
          • 1.3.1.2. Obtaining help
          • 1.3.1.3. Installing LWP
        • 1.3.2. Installing LWP Manually
          • 1.3.2.1. Download distributions
          • 1.3.2.2. Unpack and configure
          • 1.3.2.3. Make, test, and install
      • 1.4. Words of Caution
        • 1.4.1. Network and Server Load
        • 1.4.2. Copyright
        • 1.4.3. Acceptable Use
      • 1.5. LWP in Action
        • 1.5.1. The Object-Oriented Interface
        • 1.5.2. Forms
        • 1.5.3. Parsing HTML
        • 1.5.4. Authentication
    • 2. Web Basics
      • 2.1. URLs
      • 2.2. An HTTP Transaction
        • 2.2.1. Request
        • 2.2.2. Response
      • 2.3. LWP::Simple
        • 2.3.1. Basic Document Fetch
        • 2.3.2. Fetch and Store
        • 2.3.3. Fetch and Print
        • 2.3.4. Previewing with HEAD
      • 2.4. Fetching Documents Without LWP::Simple
      • 2.5. Example: AltaVista
      • 2.6. HTTP POST
      • 2.7. Example: Babelfish
    • 3. The LWP Class Model
      • 3.1. The Basic Classes
      • 3.2. Programming with LWP Classes
      • 3.3. Inside the do_GET and do_POST Functions
      • 3.4. User Agents
        • 3.4.1. Connection Parameters
        • 3.4.2. Request Parameters
        • 3.4.3. Protocols
        • 3.4.4. Redirection
        • 3.4.5. Authentication
        • 3.4.6. Proxies
        • 3.4.7. Request Methods
          • 3.4.7.1. Saving response content to a file
          • 3.4.7.2. Sending response content to a callback
          • 3.4.7.3. Mirroring a URL to a file
        • 3.4.8. Advanced Methods
      • 3.5. HTTP::Response Objects
        • 3.5.1. Status Line
        • 3.5.2. Content
        • 3.5.3. Headers
        • 3.5.4. Expiration Times
        • 3.5.5. Base for Relative URLs
        • 3.5.6. Debugging
      • 3.6. LWP Classes: Behind the Scenes
    • 4. URLs
      • 4.1. Parsing URLs
        • 4.1.1. Constructors
        • 4.1.2. Output
        • 4.1.3. Comparison
        • 4.1.4. Components of a URL
        • 4.1.5. Queries
      • 4.2. Relative URLs
      • 4.3. Converting Absolute URLs to Relative
      • 4.4. Converting Relative URLs to Absolute
    • 5. Forms
      • 5.1. Elements of an HTML Form
      • 5.2. LWP and GET Requests
        • 5.2.1. GETting Fixed URLs
        • 5.2.2. GETting a query_form( ) URL
      • 5.3. Automating Form Analysis
      • 5.4. Idiosyncrasies of HTML Forms
        • 5.4.1. Hidden Elements
        • 5.4.2. Text Elements
        • 5.4.3. Password Elements
        • 5.4.4. Checkboxes
        • 5.4.5. Radio Buttons
        • 5.4.6. Submit Buttons
        • 5.4.7. Image Buttons
        • 5.4.8. Reset Buttons
        • 5.4.9. File Selection Elements
        • 5.4.10. Textarea Elements
        • 5.4.11. Select Elements and Option Elements
      • 5.5. POST Example: License Plates
        • 5.5.1. The Form
        • 5.5.2. Use formpairs.pl
        • 5.5.3. Translating This into LWP
      • 5.6. POST Example: ABEBooks.com
        • 5.6.1. The Form
        • 5.6.2. Translating This into LWP
        • 5.6.3. Adding Features
        • 5.6.4. Generalizing the Program
      • 5.7. File Uploads
      • 5.8. Limits on Forms
    • 6. Simple HTML Processing with Regular Expressions
      • 6.1. Automating Data Extraction
      • 6.2. Regular Expression Techniques
        • 6.2.1. Anchor Your Match
        • 6.2.2. Whitespace
        • 6.2.3. Embedded Newlines
        • 6.2.4. Minimal and Greedy Matches
        • 6.2.5. Capture
        • 6.2.6. Repeated Matches
        • 6.2.7. Develop from Components
        • 6.2.8. Use Multiple Steps
      • 6.3. Troubleshooting
      • 6.4. When Regular Expressions Aren't Enough
      • 6.5. Example: Extracting Links from a Bookmark File
      • 6.6. Example: Extracting Links from Arbitrary HTML
      • 6.7. Example: Extracting Temperatures from Weather Underground
    • 7. HTML Processing with Tokens
      • 7.1. HTML as Tokens
      • 7.2. Basic HTML::TokeParser Use
        • 7.2.1. Start-Tag Tokens
        • 7.2.2. End-Tag Tokens
        • 7.2.3. Text Tokens
        • 7.2.4. Comment Tokens
        • 7.2.5. Markup Declaration Tokens
        • 7.2.6. Processing Instruction Tokens
      • 7.3. Individual Tokens
        • 7.3.1. Checking Image Tags
        • 7.3.2. HTML Filters
      • 7.4. Token Sequences
        • 7.4.1. Example: BBC Headlines
        • 7.4.2. Translating the Problem into Code
        • 7.4.3. Bundling into a Program
      • 7.5. More HTML::TokeParser Methods
        • 7.5.1. The get_text( ) Method
        • 7.5.2. The get_text( ) Method with Parameters
        • 7.5.3. The get_trimmed_text( ) Method
        • 7.5.4. The get_tag( ) Method
          • 7.5.4.1. Start-tags
          • 7.5.4.2. End-tags
        • 7.5.5. The get_tag( ) Method with Parameters
      • 7.6. Using Extracted Text
    • 8. Tokenizing Walkthrough
      • 8.1. The Problem
      • 8.2. Getting the Data
      • 8.3. Inspecting the HTML
      • 8.4. First Code
      • 8.5. Narrowing In
      • 8.6. Rewrite for Features
        • 8.6.1. Debuggability
        • 8.6.2. Images and Applets
        • 8.6.3. Link Text
        • 8.6.4. Live Data
      • 8.7. Alternatives
    • 9. HTML Processing with Trees
      • 9.1. Introduction to Trees
      • 9.2. HTML::TreeBuilder
        • 9.2.1. Constructors
        • 9.2.2. Parse Options
        • 9.2.3. Parsing
        • 9.2.4. Cleanup
      • 9.3. Processing
        • 9.3.1. Methods for Searching the Tree
        • 9.3.2. Attributes of a Node
        • 9.3.3. Traversing
      • 9.4. Example: BBC News
      • 9.5. Example: Fresh Air
    • 10. Modifying HTML with Trees
      • 10.1. Changing Attributes
        • 10.1.1. Whitespace
        • 10.1.2. Other HTML Options
      • 10.2. Deleting Images
      • 10.3. Detaching and Reattaching
        • 10.3.1. The detach_content( ) Method
        • 10.3.2. Constraints
      • 10.4. Attaching in Another Tree
        • 10.4.1. Retaining Comments
        • 10.4.2. Accessing Comments
        • 10.4.3. Attaching Content
      • 10.5. Creating New Elements
        • 10.5.1. Literals
        • 10.5.2. New Nodes from Lists
    • 11. Cookies, Authentication, and Advanced Requests
      • 11.1. Cookies
        • 11.1.1. Enabling Cookies
        • 11.1.2. Loading Cookies from a File
        • 11.1.3. Saving Cookies to a File
        • 11.1.4. Cookies and the New York Times Site
      • 11.2. Adding Extra Request Header Lines
        • 11.2.1. Pretending to Be Netscape
        • 11.2.2. Referer
      • 11.3. Authentication
        • 11.3.1. Comparing Cookies with Basic Authentication
        • 11.3.2. Authenticating via LWP
        • 11.3.3. Security
      • 11.4. An HTTP Authentication Example: The Unicode Mailing Archive
    • 12. Spiders
      • 12.1. Types of Web-Querying Programs
      • 12.2. A User Agent for Robots
      • 12.3. Example: A Link-Checking Spider
        • 12.3.1. The Basic Spider Logic
        • 12.3.2. Overall Design in the Spider
        • 12.3.3. HEAD Response Processing
        • 12.3.4. Redirects
        • 12.3.5. Link Extraction
        • 12.3.6. Fleshing Out the URL Scheduling
        • 12.3.7. The Rest of the Code
      • 12.4. Ideas for Further Expansion
    • A. LWP Modules
    • B. HTTP Status Codes
      • B.1. 100s: Informational
      • B.2. 200s: Successful
      • B.3. 300s: Redirection
      • B.4. 400s: Client Errors
      • B.5. 500s: Server Errors
    • C. Common MIME Types
    • D. Language Tags
    • E. Common Content Encodings
    • F. ASCII Table
    • G. User's View of Object-Oriented Modules
      • G.1. A User's View of Object-Oriented Modules
      • G.2. Modules and Their Functional Interfaces
      • G.3. Modules with Object-Oriented Interfaces
      • G.4. What Can You Do with Objects?
      • G.5. What's in an Object?
      • G.6. What Is an Object Value?
      • G.7. So Why Do Some Modules Use Objects?
      • G.8. The Gory Details
    • Index
    • Colophon
    • SPECIAL OFFER: Upgrade this ebook with O'Reilly

(c) 2005-2024 CATALIST interactive agency; trademarks belong to the publisher Helion S.A.