Class TaskJuggler::TextParser::Scanner
In: lib/taskjuggler/TextParser/Scanner.rb
Parent: Object

The Scanner class is an abstract text scanner with support for nested include files and text macros. The tokenizer will operate on rules that must be provided by a derived class. The scanner is modal. Each mode operates only with the subset of token patterns that are assigned to the current mode. The current line is tracked accurately and can be used for error reporting. The scanner can operate on Strings or Files.
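
The modal behaviour described above can be illustrated with a minimal sketch. This is not the TaskJuggler implementation: the class TinyModalScanner, its method names and the :plain and :string modes are all hypothetical, and Ruby's standard StringScanner stands in for the stream handling. It only shows how each mode restricts matching to its own subset of patterns.

```ruby
require 'strscan'

# Hypothetical miniature of a modal scanner; not the TaskJuggler class.
class TinyModalScanner
  attr_accessor :mode

  def initialize(text, default_mode)
    @scanner = StringScanner.new(text)
    @patterns_by_mode = Hash.new { |hash, mode| hash[mode] = [] }
    @mode = default_mode
  end

  # Register a pattern for one mode. A nil type marks text to be skipped.
  def add_pattern(type, regexp, mode)
    @patterns_by_mode[mode] << [type, regexp]
  end

  # Return [type, text] for the next token, or [:endOfText, nil].
  def next_token
    until @scanner.eos?
      type, text = match_active_pattern
      raise "No pattern matches at position #{@scanner.pos}" if text.nil?
      return [type, text] if type
    end
    [:endOfText, nil]
  end

  private

  # Try only the patterns of the current mode, in registration order.
  # Empty matches are ignored so that the scanner always makes progress.
  def match_active_pattern
    @patterns_by_mode[@mode].each do |type, regexp|
      text = @scanner.scan(regexp)
      return [type, text] if text && !text.empty?
    end
    [nil, nil]
  end
end

scanner = TinyModalScanner.new('name "two words"', :plain)
scanner.add_pattern(nil, /\s+/, :plain)
scanner.add_pattern(:id, /[a-z]+/, :plain)
scanner.add_pattern(:quote, /"/, :plain)
scanner.add_pattern(:text, /[^"]+/, :string)
scanner.add_pattern(:quote, /"/, :string)

tokens = []
loop do
  token = scanner.next_token
  tokens << token
  break if token[0] == :endOfText
  # Toggle between the two modes on every quote character.
  if token[0] == :quote
    scanner.mode = (scanner.mode == :plain ? :string : :plain)
  end
end
```

In :plain mode the blank between the words would be skipped as whitespace; because the scanner is in :string mode at that point, it is kept as part of the :text token instead.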

Classes and Modules

Class TaskJuggler::TextParser::Scanner::BufferStreamHandle
Class TaskJuggler::TextParser::Scanner::FileStreamHandle
Class TaskJuggler::TextParser::Scanner::MacroStackEntry
Class TaskJuggler::TextParser::Scanner::StreamHandle

Attributes

mrxs  [R] 

Public Class methods

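Create a new instance of Scanner. masterFile must be a String that either contains the name of the file to start with or the text itself. log is a Log to report progress and status. tokenPatterns is an Array of pattern descriptions; each entry is registered with addPattern. defaultMode is the mode that the scanner is in at the start and end of a file. Error messages are sent to the shared MessageHandlerInstance, which is not passed in as a parameter.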

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 247
    def initialize(masterFile, log, tokenPatterns, defaultMode)
      @masterFile = masterFile
      @messageHandler = TaskJuggler::MessageHandlerInstance.instance
      @log = log
      # This table contains all macros that may be expanded when found in the
      # text.
      @macroTable = MacroTable.new
      # The currently processed IO object.
      @cf = nil
      # This Array stores the currently processed nested files. It's an Array
      # of Arrays. The nested Array consists of 3 elements: the IO object, the
      # @tokenBuffer and an optional block that is called at the end of file.
      @fileStack = []
      # This flag is set if we have reached the end of a file. Since we will
      # only know when the next new token is requested that the file is really
      # done now, we have to use this flag.
      @finishLastFile = false
      # True if the scanner operates on a buffer.
      @fileNameIsBuffer = false
      # A SourceFileInfo of the start of the currently processed token.
      @startOfToken = nil
      # Line number correction for error messages.
      @lineDelta = 0
      # Lists of regexps that describe the detectable tokens. The Arrays are
      # grouped by mode.
      @patternsByMode = { }
      # The currently active scanner mode.
      @scannerMode = nil
      # The mode that the scanner is in at the start and end of file
      @defaultMode = defaultMode
      # Points to the currently active pattern set as defined by the mode.
      @activePatterns = nil

      @mrxs = MRXScanner.new

      tokenPatterns.each do |pat|
        type = pat[0]
        regExp = pat[1]
        mode = pat[3] || :tjp
        postProc = pat[4]
        addPattern(type, Regexp.new(regExp), mode, postProc)
        @mrxs.addRegExp(regExp, type, postProc, mode)
      end
      self.mode = defaultMode
    end

Public Instance methods

Add a Macro to the macro translation table.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 480
    def addMacro(macro)
      @macroTable.add(macro)
    end

Add a new pattern to the scanner. type is either nil for tokens that will be ignored, or some identifier that will be returned with each token of this type. regExp is the Regexp that describes the token. mode identifies the scanner mode where the pattern is active. If it's only a single mode, mode specifies the mode directly. For multiple modes, it's an Array of modes. postProc is a method reference. This method is called after the token has been detected. The method gets the type and the matching String and returns them again in an Array.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 301
    def addPattern(type, regExp, mode, postProc = nil)
      if mode.is_a?(Array)
        mode.each do |m|
          # The pattern is active in multiple modes
          @patternsByMode[m] = [] unless @patternsByMode.include?(m)
          @patternsByMode[m] << [ type, regExp, postProc ]
        end
      else
        # The pattern is only active in one specific mode.
        @patternsByMode[mode] = [] unless @patternsByMode.include?(mode)
        @patternsByMode[mode] << [ type, regExp, postProc ]
      end
    end
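
As a rough illustration of the postProc contract described above, the following sketch registers patterns in a plain Hash and applies a post-processor to a match. The helpers register_pattern and apply_pattern, the token types and the lambdas are hypothetical and exist only for this example; only the [ type, regExp, postProc ] entry layout is taken from the listing.

```ruby
# Store [type, regexp, post_proc] under one mode or under each mode of an
# Array of modes. Hypothetical helper mirroring the grouping shown above.
def register_pattern(table, type, regexp, modes, post_proc = nil)
  Array(modes).each do |mode|
    (table[mode] ||= []) << [type, regexp, post_proc]
  end
  table
end

# Match one table entry against the text. The post-processor, if present,
# receives the type and the matched String and returns both in an Array.
def apply_pattern(entry, text)
  type, regexp, post_proc = entry
  match = text[regexp]
  return nil unless match
  post_proc ? post_proc.call(type, match) : [type, match]
end

table = {}
strip_quotes = ->(type, match) { [type, match[1..-2]] }
to_integer = ->(type, match) { [type, match.to_i] }

register_pattern(table, :STRING, /\A"[^"]*"/, [:tjp, :header], strip_quotes)
register_pattern(table, :INTEGER, /\A\d+/, :tjp, to_integer)

string_token = apply_pattern(table[:tjp][0], '"hello" rest')
integer_token = apply_pattern(table[:tjp][1], '42 rest')
```

Here the :STRING pattern is active in two modes and the :INTEGER pattern in one. Each post-processor converts the raw match into the value that a parser would more likely want to see.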

Finish processing and reset all data structures.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 345
    def close
      unless @fileNameIsBuffer
        @log.startProgressMeter("Reading file #{@masterFile}")
        @log.stopProgressMeter
      end
      @fileStack = []
      @cf = @tokenBuffer = nil
    end

Call this function to report any errors related to the parsed input.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 510
    def error(id, text, sfi = nil, data = nil)
      message(:error, id, text, sfi, data)
    end

Expand a macro and inject it into the input stream. prefix is any string that was found right before the macro call. We have to inject it before the expanded macro. args is an Array of Strings. The first is the macro name, the rest are the parameters. callLength is the number of characters for the complete macro call "${…}".

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 494
    def expandMacro(prefix, args, callLength)
      # Get the expanded macro from the @macroTable.
      macro, text = @macroTable.resolve(args, sourceFileInfo)
      unless macro && text
        error('undefined_macro', "Undefined macro '#{args[0]}' called")
      end

      # If the expanded macro is empty, we can ignore it.
      return if text == ''

      unless @cf.injectMacro(macro, args, prefix + text, callLength)
        error('macro_stack_overflow', "Too many nested macro calls.")
      end
    end
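
The two error cases above (undefined macro, too many nested calls) can be sketched in isolation. The following example is a simplification and not how the Scanner works internally: it substitutes macro calls directly in a String instead of injecting text into the input stream, it ignores macro parameters, and the class TinyMacroExpander and its MAX_DEPTH limit are assumptions made for this sketch.

```ruby
# Hypothetical stand-in for a macro table with a nesting limit.
class TinyMacroExpander
  MAX_DEPTH = 10 # assumed limit, chosen only for this sketch

  # macros maps macro names to their replacement text.
  def initialize(macros)
    @macros = macros
  end

  # Expand every ${name} call in text. Expanded text is scanned again so
  # that macros may call other macros, up to MAX_DEPTH levels deep.
  def expand(text, depth = 0)
    raise 'Too many nested macro calls.' if depth > MAX_DEPTH

    text.gsub(/\$\{(\w+)\}/) do
      name = Regexp.last_match(1)
      body = @macros.fetch(name) { raise "Undefined macro '#{name}' called" }
      expand(body, depth + 1)
    end
  end
end

expander = TinyMacroExpander.new(
  'greet' => 'hello ${who}',
  'who' => 'world',
  'loop' => '${loop}'
)
expanded = expander.expand('say ${greet}!')
```

Text in front of a macro call is preserved by the substitution, which plays the role that the prefix argument has in expandMacro.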

Return the name of the currently processed file. If we are working on a text buffer, the text will be returned.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 402
    def fileName
      @cf ? @cf.fileName : @masterFile
    end

Continue processing with a new file specified by includeFileName. When this file is finished, we will continue in the old file after the location where we started with the new file. The method returns the fully qualified name of the included file.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 358
    def include(includeFileName, sfi, &block)
      if includeFileName[0] != '/'
        pathOfCallingFile = @fileStack.last[0].dirname
        path = pathOfCallingFile.empty? ? '' : pathOfCallingFile + '/'
        # If the included file is not an absolute name, we interpret the file
        # name relative to the including file.
        includeFileName = path + includeFileName
      end

      # Try to detect recursive inclusions. This will not work if files are
      # accessed via filesystem links.
      @fileStack.each do |entry|
        if includeFileName == entry[0].fileName
          error('include_recursion',
                "Recursive inclusion of #{includeFileName} detected", sfi)
        end
      end

      # Save @tokenBuffer in the record of the parent file.
      @fileStack.last[1] = @tokenBuffer unless @fileStack.empty?
      @tokenBuffer = nil
      @finishLastFile = false

      # Open the new file and push the handle on the @fileStack.
      begin
        @fileStack << [ (@cf = FileStreamHandle.new(includeFileName, @log,
                                                    self)), nil, block ]
        @log.msg { "Parsing file #{includeFileName}" }
      rescue StandardError
        error('bad_include', "Cannot open include file #{includeFileName}", sfi)
      end

      # Return the name of the included file.
      includeFileName
    end
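
The two checks performed before a file is opened, relative path resolution and recursion detection, can be sketched without any file access. The helpers resolve_include and push_include and the file names are hypothetical. Unlike the listing, this sketch uses File.dirname, which returns '.' rather than an empty String for a file without a directory part.

```ruby
# Interpret a non-absolute name relative to the directory of the
# including file. Absolute names are returned unchanged.
def resolve_include(including_file, name)
  return name if name.start_with?('/')

  dir = File.dirname(including_file)
  dir == '.' ? name : File.join(dir, name)
end

# Return the stack of open files extended by candidate, or raise if the
# candidate is already being processed. As in the listing, the comparison
# is by name only, so filesystem links are not detected.
def push_include(open_files, candidate)
  if open_files.include?(candidate)
    raise "Recursive inclusion of #{candidate} detected"
  end

  open_files + [candidate]
end

stack = ['plans/main.tjp']
tasks = resolve_include(stack.last, 'tasks.tji')
stack = push_include(stack, tasks)
```

Because names are resolved against the including file rather than the working directory, a nested include behaves the same no matter where the top-level file was opened from.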

Return true if the Macro name has been added already.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 485
    def macroDefined?(name)
      @macroTable.include?(name)
    end

Switch the scanner to another mode. The scanner will then only detect patterns of that newMode. A RuntimeError is raised if no patterns have been registered for newMode.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 317
    def mode=(newMode)
      #puts "**** New mode: #{newMode}"
      @activePatterns = @patternsByMode[newMode]
      raise "Undefined mode #{newMode}" unless @activePatterns
      @scannerMode = newMode
    end

Return the next token from the input stream. The result is an Array with 3 entries: the token type, the token String and the SourceFileInfo where the token started.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 421
    def nextToken
      # If we have a pushed-back token, return that first.
      unless @tokenBuffer.nil?
        res = @tokenBuffer
        @tokenBuffer = nil
        return res
      end

      if @finishLastFile
        # The previously processed file has now really been processed to
        # completion. Close it and remove the corresponding entry from the
        # @fileStack.
        @finishLastFile = false
        #@log.msg { "Completed file #{@cf.fileName}" }

        # If we have a block to be executed on EOF, we call it now.
        onEof = @fileStack.last[2]
        onEof.call if onEof

        @cf.close if @cf
        @fileStack.pop

        if @fileStack.empty?
          # We are done with the top-level file now.
          @cf = @tokenBuffer = nil
          @finishLastFile = true
          return [ :endOfText, '<EOT>', @startOfToken ]
        else
          # Continue parsing the file that included the current file.
          @cf, tokenBuffer = @fileStack.last
          @log.msg { "Parsing file #{@cf.fileName} ..." }
          # If we have a left over token from previously processing this file,
          # return it now.
          if tokenBuffer
            @finishLastFile = true if tokenBuffer[0] == :eof
            return tokenBuffer
          end
        end
      end

      # Start processing characters from the input.
      # (Un)-comment to toggle between StringScanner (scanToken) and
      # MRXScanner (mrxScanToken).
      scanToken
      # mrxScanToken
    end

Start the processing. If fileNameIsBuffer is true, we operate on a String, else on a File.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 327
    def open(fileNameIsBuffer = false)
      @fileNameIsBuffer = fileNameIsBuffer
      if fileNameIsBuffer
        @fileStack = [ [ @cf = BufferStreamHandle.new(@masterFile, @log, self),
                         nil, nil ] ]
      else
        begin
          @fileStack = [ [ @cf = FileStreamHandle.new(@masterFile, @log, self),
                           nil, nil ] ]
        rescue IOError, SystemCallError
          error('open_file', "Cannot open file #{@masterFile}: #{$!}")
        end
      end
      @masterPath = @cf.dirname + '/'
      @tokenBuffer = nil
    end

Push a token back so that the next nextToken() call returns it again. Only 1 token can be pushed back between two nextToken() calls.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 470
    def returnToken(token)
      #@log.msg { "-> Returning Token: [#{token[0]}][#{token[1]}]" }
      unless @tokenBuffer.nil?
        $stderr.puts @tokenBuffer
        raise "Fatal Error: Cannot return more than 1 token in a row"
      end
      @tokenBuffer = token
    end
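
The interplay of nextToken and returnToken amounts to a one-token lookahead buffer. The following sketch shows that idea on a pre-built token list. The class TokenSource and its method names are hypothetical, and the real Scanner produces tokens from the input stream rather than from an Array.

```ruby
# Hypothetical token source with a single-slot push-back buffer.
class TokenSource
  def initialize(tokens)
    @tokens = tokens.dup
    @buffer = nil
  end

  # Return the pushed-back token first, if there is one.
  def next_token
    unless @buffer.nil?
      token = @buffer
      @buffer = nil
      return token
    end
    @tokens.shift || [:endOfText, '<EOT>']
  end

  # Push one token back. A second push without an intervening next_token
  # call is a programming error.
  def return_token(token)
    raise 'Cannot return more than 1 token in a row' unless @buffer.nil?

    @buffer = token
  end
end

source = TokenSource.new([[:id, 'task'], [:string, 'Build']])

# Peek at the next token without consuming it.
peeked = source.next_token
source.return_token(peeked)
first = source.next_token
second = source.next_token
last = source.next_token
```

A single slot is enough for a parser that needs to look only one token ahead, and the exception makes accidental deeper lookahead visible immediately.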

Return SourceFileInfo for the current processing position.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 395
    def sourceFileInfo
      @cf ? SourceFileInfo.new(fileName, @cf.lineNo - @lineDelta, 0) :
            SourceFileInfo.new(@masterFile, 0, 0)
    end

Call this function to report any warnings related to the parsed input.

[Source]

# File lib/taskjuggler/TextParser/Scanner.rb, line 514
    def warning(id, text, sfi = nil, data = nil)
      message(:warning, id, text, sfi, data)
    end
