Class TaskJuggler::TextScanner
In: lib/TextScanner.rb
Parent: Object

The TextScanner class is an abstract text scanner with support for nested include files and text macros. The tokenizer will operate on rules that must be provided by a derived class. The scanner is modal. Each mode operates only with the subset of token patterns that are assigned to the current mode. The current line is tracked accurately and can be used for error reporting. The scanner can operate on Strings or Files.
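The modal behaviour described above can be illustrated with a minimal, stand-alone sketch. It is not the TaskJuggler implementation: MiniModalScanner, MINI_PATTERNS and the mode names :main and :macro are hypothetical, and Ruby's standard StringScanner stands in for the stream handles.

```ruby
require 'strscan'

# Hypothetical miniature of a modal scanner (assumption, not TaskJuggler code).
class MiniModalScanner
  attr_accessor :mode

  def initialize(text, patterns_by_mode, default_mode)
    @scanner = StringScanner.new(text)
    @patterns_by_mode = patterns_by_mode
    @mode = default_mode
  end

  # Return [type, text] for the next token, or [:eof, nil] at the end.
  def next_token
    loop do
      return [:eof, nil] if @scanner.eos?

      # Only the patterns of the current mode are tried, in order.
      type, regexp = @patterns_by_mode.fetch(@mode).find do |_, re|
        @scanner.match?(re)
      end
      raise "No token matches at position #{@scanner.pos}" unless regexp

      text = @scanner.scan(regexp)
      return [type, text] unless type.nil? # nil type: skip this token
    end
  end
end

MINI_PATTERNS = {
  main: [[nil, /\s+/], [:ID, /[a-z]+/], [:LBRACKET, /\[/]],
  macro: [[:TEXT, /[^\]]+/], [:RBRACKET, /\]/]]
}.freeze

scanner = MiniModalScanner.new('foo [a b]', MINI_PATTERNS, :main)
tokens = []
loop do
  token = scanner.next_token
  break if token[0] == :eof

  tokens << token
  # The caller switches modes, much like a parser would.
  scanner.mode = :macro if token[0] == :LBRACKET
  scanner.mode = :main if token[0] == :RBRACKET
end
```

In :main mode the text between the brackets would be split into two ID tokens. In :macro mode the same characters are delivered as a single TEXT token, because only the patterns of the active mode are consulted.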

Classes and Modules

Class TaskJuggler::TextScanner::BufferStreamHandle
Class TaskJuggler::TextScanner::FileStreamHandle
Class TaskJuggler::TextScanner::MacroStackEntry
Class TaskJuggler::TextScanner::StreamHandle

Public Class methods

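Create a new instance of TextScanner. masterFile must be a String that contains either the name of the file to start with or the text itself. messageHandler is a MessageHandler that is used for error messages. tokenPatterns is an Array of pattern descriptions. Each entry holds the token type, the regular expression, an optional mode (or Array of modes) that defaults to :tjp, and an optional post-processing method. defaultMode is the mode the scanner starts in.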

     # File lib/TextScanner.rb, line 189
     def initialize(masterFile, messageHandler, tokenPatterns, defaultMode)
       @masterFile = masterFile
       @messageHandler = messageHandler
       # This table contains all macros that may be expanded when found in the
       # text.
       @macroTable = MacroTable.new(messageHandler)
       # The currently processed IO object.
       @cf = nil
       # This Array stores the currently processed nested files. It's an Array
       # of Arrays. The nested Array consists of 2 elements, the IO object and
       # the @tokenBuffer.
       @fileStack = []
       # This flag is set if we have reached the end of a file. Since we will
       # only know when the next new token is requested that the file is really
       # done now, we have to use this flag.
       @finishLastFile = false
       # True if the scanner operates on a buffer.
       @fileNameIsBuffer = false
       # A SourceFileInfo of the start of the currently processed token.
       @startOfToken = nil
       # Line number correction for error messages.
       @lineDelta = 0
       # Lists of regexps that describe the detectable tokens. The Arrays are
       # grouped by mode.
       @patternsByMode = { }
       # The currently active scanner mode.
       @scannerMode = nil
       # Points to the currently active pattern set as defined by the mode.
       @activePatterns = nil

       tokenPatterns.each do |pat|
         type = pat[0]
         regExp = pat[1]
         mode = pat[2] || :tjp
         postProc = pat[3]
         addPattern(type, regExp, mode, postProc)
       end
       self.mode = defaultMode
     end

Public Instance methods

Add a Macro to the macro translation table.

     # File lib/TextScanner.rb, line 447
     def addMacro(macro)
       @macroTable.add(macro)
     end

Add a new pattern to the scanner. type is either nil for tokens that will be ignored, or some identifier that will be returned with each token of this type. regExp is the Regexp that describes the token. mode identifies the scanner mode in which the pattern is active: a single mode is passed directly, multiple modes are passed as an Array of modes. postProc is an optional method reference that is called after the token has been detected. It receives the type and the matching String and returns them, possibly modified, in an Array.

     # File lib/TextScanner.rb, line 237
     def addPattern(type, regExp, mode, postProc = nil)
       if mode.is_a?(Array)
         mode.each do |m|
           # The pattern is active in multiple modes
           @patternsByMode[m] = [] unless @patternsByMode.include?(m)
           @patternsByMode[m] << [ type, regExp, postProc ]
         end
       else
         # The pattern is only active in one specific mode.
         @patternsByMode[mode] = [] unless @patternsByMode.include?(mode)
         @patternsByMode[mode] << [ type, regExp, postProc ]
       end
     end
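The grouping of patterns by mode can be tried out in isolation. The following is a sketch only: group_patterns_by_mode, SAMPLE_PATTERNS, strip_quotes and the mode name :macroCall are hypothetical names chosen for illustration, and the entry layout [type, regExp, mode, postProc] is taken from the constructor listing.

```ruby
# Hypothetical stand-alone version of the registration logic (assumption).
def group_patterns_by_mode(token_patterns, fallback_mode = :tjp)
  patterns_by_mode = Hash.new { |hash, mode| hash[mode] = [] }
  token_patterns.each do |type, reg_exp, mode, post_proc|
    # Array() wraps a single mode and leaves an Array of modes untouched.
    Array(mode || fallback_mode).each do |m|
      patterns_by_mode[m] << [type, reg_exp, post_proc]
    end
  end
  patterns_by_mode
end

# Post-processor: receives the type and the matched String and returns
# both again, here with the surrounding quotes removed.
strip_quotes = ->(type, match) { [type, match[1..-2]] }

SAMPLE_PATTERNS = [
  [nil, /\s+/, %i[tjp macroCall]],          # ignored token, active in 2 modes
  [:ID, /[a-zA-Z_]\w*/],                    # no mode given, falls back to :tjp
  [:STRING, /"[^"]*"/, :tjp, strip_quotes]  # single mode with post-processing
].freeze

by_mode = group_patterns_by_mode(SAMPLE_PATTERNS)
```

The order of the entries within each mode is preserved. This matters because the scanner tries the patterns of the active mode in sequence and uses the first one that matches.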

Finish processing and reset all data structures.

     # File lib/TextScanner.rb, line 279
     def close
       unless @fileNameIsBuffer
         Log.startProgressMeter("Reading file #{@masterFile}")
         Log.stopProgressMeter
       end
       @fileStack = []
       @cf = @tokenBuffer = nil
     end

Call this function to report any errors related to the parsed input.

     # File lib/TextScanner.rb, line 476
     def error(id, text, sfi = nil, data = nil)
       message(:error, id, text, sfi, data)
     end

Expand a macro and inject it into the input stream. prefix is any string that was found right before the macro call. We have to inject it before the expanded macro. args is an Array of Strings. The first is the macro name, the rest are the parameters.

     # File lib/TextScanner.rb, line 460
     def expandMacro(prefix, args)
       # Get the expanded macro from the @macroTable.
       macro, text = @macroTable.resolve(args, sourceFileInfo)
       unless macro && text
         error('undefined_macro', "Undefined macro '#{args[0]}' called")
       end

       # If the expanded macro is empty, we can ignore it.
       return if text == ''

       unless @cf.injectMacro(macro, args, prefix + text)
         error('macro_stack_overflow', "Too many nested macro calls.")
       end
     end

Return the name of the currently processed file. If we are working on a text buffer, the text will be returned.

     # File lib/TextScanner.rb, line 336
     def fileName
       @cf ? @cf.fileName : @masterFile
     end

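Continue processing with a new file specified by includeFileName. When this file is finished, we will continue in the old file after the location where we started with the new file. sfi is the SourceFileInfo used for error messages. If a block is given, it is called when the end of the included file has been reached. The method returns the fully qualified name of the included file.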

     # File lib/TextScanner.rb, line 292
     def include(includeFileName, sfi, &block)
       if includeFileName[0] != '/'
         pathOfCallingFile = @fileStack.last[0].dirname
         path = pathOfCallingFile.empty? ? '' : pathOfCallingFile + '/'
         # If the included file is not an absolute name, we interpret the file
         # name relative to the including file.
         includeFileName = path + includeFileName
       end

       # Try to detect recursive inclusions. This will not work if files are
       # accessed via filesystem links.
       @fileStack.each do |entry|
         if includeFileName == entry[0].fileName
           error('include_recursion',
                 "Recursive inclusion of #{includeFileName} detected", sfi)
         end
       end

       # Save @tokenBuffer in the record of the parent file.
       @fileStack.last[1] = @tokenBuffer unless @fileStack.empty?
       @tokenBuffer = nil
       @finishLastFile = false

       # Open the new file and push the handle on the @fileStack.
       begin
         @fileStack << [ (@cf = FileStreamHandle.new(includeFileName)),
                         nil, block ]
         Log << "Parsing file #{includeFileName}"
       rescue StandardError
         error('bad_include', "Cannot open include file #{includeFileName}", sfi)
       end

       # Return the name of the included file.
       includeFileName
     end
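The name resolution and the recursion check can be sketched without any file access. resolve_include and the file names below are hypothetical. Like the listing above, the sketch compares names as plain Strings, so it shares the limitation that differently spelled paths to the same file are not recognized.

```ruby
# Hypothetical helper mirroring the path handling described above (assumption).
# including_file is the name of the file that contains the include statement,
# open_files lists the names of all files currently being processed.
def resolve_include(include_name, including_file, open_files)
  unless include_name.start_with?('/')
    # Relative names are interpreted relative to the including file.
    dir = File.dirname(including_file)
    include_name = File.join(dir, include_name) unless dir == '.'
  end
  if open_files.include?(include_name)
    raise ArgumentError, "Recursive inclusion of #{include_name} detected"
  end

  include_name
end

stack = ['/proj/main.tjp', '/proj/parts/tasks.tji']
resolved = resolve_include('resources.tji', stack.last, stack)
```

Here resolved is the name relative to the directory of the innermost open file. Including tasks.tji from within itself would raise the ArgumentError.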

Return true if a Macro with the given name has already been added.

     # File lib/TextScanner.rb, line 452
     def macroDefined?(name)
       @macroTable.include?(name)
     end

Switch the scanner to another mode. The scanner will then only detect patterns of that newMode. An exception is raised if no patterns have been added for newMode.

     # File lib/TextScanner.rb, line 253
     def mode=(newMode)
       #puts "**** New mode: #{newMode}"
       @activePatterns = @patternsByMode[newMode]
       raise "Undefined mode #{newMode}" unless @activePatterns
       @scannerMode = newMode
     end

Return the next token from the input stream. The result is an Array with 3 entries: the token type, the token String and the SourceFileInfo where the token started.

     # File lib/TextScanner.rb, line 355
     def nextToken
       # If we have a pushed-back token, return that first.
       unless @tokenBuffer.nil?
         res = @tokenBuffer
         @tokenBuffer = nil
         return res
       end

       if @finishLastFile
         # The previously processed file has now really been processed to
         # completion. Close it and remove the corresponding entry from the
         # @fileStack.
         @finishLastFile = false
         #Log << "Completed file #{@cf.fileName}"

         # If we have a block to be executed on EOF, we call it now.
         onEof = @fileStack.last[2]
         onEof.call if onEof

         @cf.close if @cf
         @fileStack.pop

         if @fileStack.empty?
           # We are done with the top-level file now.
           @cf = @tokenBuffer = nil
           @finishLastFile = true
           return [ :endOfText, '<EOT>', @startOfToken ]
         else
           # Continue parsing the file that included the current file.
           @cf, tokenBuffer = @fileStack.last
           Log << "Parsing file #{@cf.fileName} ..."
           # If we have a left over token from previously processing this file,
           # return it now.
           if tokenBuffer
             @finishLastFile = true if tokenBuffer[0] == :eof
             return tokenBuffer
           end
         end
       end

       # Start processing characters from the input.
       @startOfToken = sourceFileInfo
       loop do
         match = nil
         begin
           @activePatterns.each do |type, re, postProc|
             if (match = @cf.scan(re))
               if match == :scannerEOF
                 # We've found the end of an input file. Return a special token
                 # that describes the end of a file.
                 @finishLastFile = true
                 return [ :eof, '<END>', @startOfToken ]
               end

               raise "#{re} matches empty string" if match.empty?
               # If we have a post processing method, call it now. It may modify
               # the type or the found token String.
               type, match = postProc.call(type, match) if postProc

               break if type.nil? # Ignore certain tokens with nil type.

               return [ type, match, @startOfToken ]
             end
           end
         rescue ArgumentError
           error('scan_encoding_error', $!.to_s)
         end

         if match.nil?
           if @cf.eof?
             error('unexpected_eof',
                   "Unexpected end of file found")
           else
             error('no_token_match',
                   "Unexpected characters found: '#{@cf.peek(10)}...'")
           end
         end
       end
     end
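The interplay of the file stack and the deferred end-of-file handling can be sketched with plain Arrays in place of files. NestedInput is a hypothetical class, its tokens have only 2 entries instead of 3, and the handling of pushed-back tokens and EOF blocks is left out. It is a simplified model under these assumptions, not the scanner itself.

```ruby
# Hypothetical nested-input reader (assumption): each stack entry is an Array
# of tokens that stands in for one open file.
class NestedInput
  def initialize(master_tokens)
    @stack = [master_tokens.dup]
    @finish_last = false
  end

  # Continue with the tokens of an included source.
  def push(tokens)
    @stack << tokens.dup
    @finish_last = false
  end

  def next_token
    if @finish_last
      # The source that reported :eof on the previous call is removed now.
      @finish_last = false
      @stack.pop
      if @stack.empty?
        @finish_last = true
        return [:endOfText, '<EOT>']
      end
    end

    token = @stack.last.shift
    return token if token

    # The current source is exhausted; it is removed on the next call.
    @finish_last = true
    [:eof, '<END>']
  end
end

input = NestedInput.new([[:ID, 'a'], [:ID, 'b']])
seen = [input.next_token]   # first token of the master source
input.push([[:ID, 'x']])    # simulate an include right after it
5.times { seen << input.next_token }
```

Each source reports its own :eof token, and :endOfText follows only after the last source has been removed from the stack. A caller therefore sees the token of the included source, its :eof, the remaining token of the master source, another :eof and finally :endOfText.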

Start the processing. If fileNameIsBuffer is true, we operate on a String, else on a File.

     # File lib/TextScanner.rb, line 263
     def open(fileNameIsBuffer = false)
       @fileNameIsBuffer = fileNameIsBuffer
       if fileNameIsBuffer
         @fileStack = [ [ @cf = BufferStreamHandle.new(@masterFile), nil, nil ] ]
       else
         begin
           @fileStack = [ [ @cf = FileStreamHandle.new(@masterFile), nil, nil ] ]
         rescue StandardError
           error('open_file', "Cannot open file #{@masterFile}")
         end
       end
       @masterPath = @cf.dirname + '/'
       @tokenBuffer = nil
     end

Push a token back so that the next nextToken() call returns it again. Only 1 token can be pushed back between two nextToken() calls.

     # File lib/TextScanner.rb, line 437
     def returnToken(token)
       #Log << "-> Returning Token: [#{token[0]}][#{token[1]}]"
       unless @tokenBuffer.nil?
         $stderr.puts @tokenBuffer
         raise "Fatal Error: Cannot return more than 1 token in a row"
       end
       @tokenBuffer = token
     end
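A one-slot push-back buffer like this gives a parser a single token of lookahead. The sketch below shows the idea on its own. PushbackTokens is a hypothetical class that reads from an Array instead of a file.

```ruby
# Hypothetical token source with a one-slot push-back buffer (assumption).
class PushbackTokens
  def initialize(tokens)
    @tokens = tokens.dup
    @buffer = nil
  end

  def next_token
    # A pushed-back token takes precedence over fresh input.
    unless @buffer.nil?
      token = @buffer
      @buffer = nil
      return token
    end
    @tokens.shift || [:endOfText, '<EOT>']
  end

  def return_token(token)
    raise 'Cannot return more than 1 token in a row' unless @buffer.nil?

    @buffer = token
  end
end

source = PushbackTokens.new([[:ID, 'task'], [:STRING, 'Build']])
peeked = source.next_token   # look at the next token ...
source.return_token(peeked)  # ... and put it back
again = source.next_token    # the same token is delivered again
```

A caller that needs to decide between two grammar rules can inspect peeked and then return it, so that the rule it selects reads the token as if it had never been consumed.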

Return SourceFileInfo for the current processing position.

     # File lib/TextScanner.rb, line 329
     def sourceFileInfo
       @cf ? SourceFileInfo.new(fileName, @cf.lineNo - @lineDelta, 0) :
             SourceFileInfo.new(@masterFile, 0, 0)
     end

Call this function to report any warnings related to the parsed input.

     # File lib/TextScanner.rb, line 480
     def warning(id, text, sfi = nil, data = nil)
       message(:warning, id, text, sfi, data)
     end
