New ScriptReader

kungfoomasta

06-06-2008 20:51:17

Here is my first draft of the new ScriptReader. I wrote this during a break at work, so it's written without any Ogre classes. The basic idea is this:

1. Read in all characters line by line
1a. For each line, convert into tokens, update token list
2. Parse Tokens and create ScriptDefinitions and DefinitionProperties.
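
The first stage of the steps above can be sketched in isolation. This is a simplified illustration only: the `Token` struct, `tokenizeLine` and `tokenize` are hypothetical names, the token types are reduced to a handful, and splitting text at braces is an assumption rather than something the draft below does.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified token type; the real QuickGUI::Token may differ.
struct Token
{
    enum Type { TYPE_TEXT, TYPE_OPENBRACE, TYPE_CLOSEBRACE, TYPE_NEWLINE, TYPE_EOF };
    Type type;
    std::string value;
};

// Step 1a: split one line into brace and text tokens, then mark the line end.
void tokenizeLine(const std::string& line, std::vector<Token>& tokens)
{
    std::size_t i = 0;
    while (i < line.size())
    {
        const char c = line[i];
        if (c == '{')      { tokens.push_back({Token::TYPE_OPENBRACE, "{"});  ++i; }
        else if (c == '}') { tokens.push_back({Token::TYPE_CLOSEBRACE, "}"}); ++i; }
        else if (std::isspace(static_cast<unsigned char>(c))) { ++i; } // spaces and tabs
        else
        {
            // Collect a run of non-space, non-brace characters as one text token.
            std::string text;
            while (i < line.size()
                   && !std::isspace(static_cast<unsigned char>(line[i]))
                   && line[i] != '{' && line[i] != '}')
                text += line[i++];
            tokens.push_back({Token::TYPE_TEXT, text});
        }
    }
    tokens.push_back({Token::TYPE_NEWLINE, "\n"});
}

// Step 1: read the input line by line and append an EOF token at the end.
std::vector<Token> tokenize(std::istream& in)
{
    std::vector<Token> tokens;
    std::string line;
    while (std::getline(in, line))
        tokenizeLine(line, tokens);
    tokens.push_back({Token::TYPE_EOF, ""});
    return tokens;
}
```

Step 2 would then walk the returned vector once and never touch the character data again, which keeps the parsing rules separate from the whitespace handling.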

A ScriptDefinition is anything in between a left and right brace, and has a specific type and id:

WidgetSkinType Button
{
}


A ScriptDefinition can contain a DefinitionProperty, which consists of a property name and 0 to many values:

uvcoords 0 0 1 1
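
As a rough illustration of the "name plus 0 to many values" shape, a property line like the one above could be split as follows. `PropertySketch` and `parsePropertyLine` are hypothetical names for this sketch and are not the DefinitionProperty class itself.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for DefinitionProperty: a name plus zero or more values.
struct PropertySketch
{
    std::string name;
    std::vector<std::string> values;
};

// Split a line on whitespace: the first word is the name, the rest are values.
PropertySketch parsePropertyLine(const std::string& line)
{
    PropertySketch p;
    std::istringstream words(line);
    words >> p.name;
    std::string value;
    while (words >> value)
        p.values.push_back(value);
    return p;
}
```

With this, `parsePropertyLine("uvcoords 0 0 1 1")` gives the name `uvcoords` and four values, and a bare name with nothing after it gives an empty value list.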

ScriptDefinitions can contain ScriptDefinitions. The ScriptDefinition class lets you get a sub-definition by <type,id>, a group of sub-definitions by type, or all sub-definitions. You can also get properties by name, or all properties of the ScriptDefinition.
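
The lookups described above map naturally onto a nested map keyed by type and then by id. The sketch below is an assumption about how that could look, not the actual ScriptDefinition class: `DefinitionSketch`, `addChild`, `find` and `findAll` are hypothetical names, and it uses `std::unique_ptr` where the draft uses raw pointers.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for ScriptDefinition.
struct DefinitionSketch
{
    typedef std::map<std::string, std::unique_ptr<DefinitionSketch> > ById;

    std::string type;
    std::string id;
    std::map<std::string, ById> children;                        // type -> id -> child
    std::map<std::string, std::vector<std::string> > properties; // name -> values

    DefinitionSketch(const std::string& t, const std::string& i) : type(t), id(i) {}

    // Create (or replace) the child stored under <t,i> and return it.
    DefinitionSketch& addChild(const std::string& t, const std::string& i)
    {
        std::unique_ptr<DefinitionSketch>& slot = children[t][i];
        slot.reset(new DefinitionSketch(t, i));
        return *slot;
    }

    // One sub-definition by <type,id>, or nullptr if there is none.
    const DefinitionSketch* find(const std::string& t, const std::string& i) const
    {
        auto byType = children.find(t);
        if (byType == children.end()) return nullptr;
        auto byId = byType->second.find(i);
        return byId == byType->second.end() ? nullptr : byId->second.get();
    }

    // All sub-definitions of one type (empty if the type is unknown).
    std::vector<const DefinitionSketch*> findAll(const std::string& t) const
    {
        std::vector<const DefinitionSketch*> out;
        auto byType = children.find(t);
        if (byType != children.end())
            for (const auto& entry : byType->second)
                out.push_back(entry.second.get());
        return out;
    }
};
```

Both lookups use `map::find` rather than `operator[]`, so asking for a type or id that does not exist never inserts an empty entry.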

Here is the code for the ScriptReader:


void ScriptReader::parseScript(std::ifstream& stream, const std::string& groupName)
{
// Read in all data
char str[256];
while(stream.getline(str,256))
{
_convertToTokens(str);
}
mTokens.push_back(Token(QuickGUI::Token::TYPE_EOF,""));

_createDefinitions();
}

void ScriptReader::_convertToTokens(std::string s)
{
int index = 0;
while(index < static_cast<int>(s.length()))
{
if(s[index] == '{')
{
// merge the last two TYPE_TEXT tokens into a "definition" token
int lastTokenIndex = static_cast<int>(mTokens.size() - 1);
if((lastTokenIndex < 2) || (mTokens[lastTokenIndex].type != QuickGUI::Token::TYPE_NEWLINE) || (mTokens[lastTokenIndex - 1].type != QuickGUI::Token::TYPE_TEXT))
{
std::cout << "Invalid syntax for Definition!" << std::endl;
return;
}

std::string type = mTokens[lastTokenIndex - 2].value;
std::string id = mTokens[lastTokenIndex - 1].value;
mTokens.pop_back(); // remove newline
mTokens.pop_back(); // remove text
mTokens.pop_back(); // remove text
mTokens.push_back(Token(QuickGUI::Token::TYPE_DEFINITION,type + " " + id)); // add definition
mTokens.push_back(Token(QuickGUI::Token::TYPE_NEWLINE,"\n")); // add newline
mTokens.push_back(Token(QuickGUI::Token::TYPE_OPENBRACE,"{"));

++index;
}
else if(s[index] == '}')
{
mTokens.push_back(Token(QuickGUI::Token::TYPE_CLOSEBRACE,"}"));
++index;
}
else if(!isspace(s[index]))
{
std::string text = "";
while((index < static_cast<int>(s.length())) && !isspace(s[index]) && ((s[index] != '\n') || (s[index] != '\r')))
{
text += s[index];
++index;
}

// Determine if this portion of text is a property name or a property value.
// A property name will have a newline in front of it.
int lastTokenIndex = static_cast<int>(mTokens.size() - 1);
if((lastTokenIndex > 0) && (mTokens[lastTokenIndex].type == QuickGUI::Token::TYPE_NEWLINE))
mTokens.push_back(Token(QuickGUI::Token::TYPE_PROPERTY,text));
else
mTokens.push_back(Token(QuickGUI::Token::TYPE_TEXT,text));
}
else
{
++index;
}
}

mTokens.push_back(Token(QuickGUI::Token::TYPE_NEWLINE,"\n"));
}

void ScriptReader::_createDefinitions()
{
int tokenIndex = 0;

ScriptDefinition* currentDefinition = NULL;

Token* currentToken = &(mTokens[tokenIndex]);

while(1)
{
switch(currentToken->type)
{
case QuickGUI::Token::TYPE_DEFINITION:
{
int index = static_cast<int>(currentToken->value.find_first_of(' '));
std::string type = currentToken->value.substr(0,index);
std::string id = currentToken->value.substr(index+1);
ScriptDefinition* newDefinition = new ScriptDefinition(type,id);

newDefinition->mParentDefinition = currentDefinition;

if(currentDefinition == NULL)
mDefinitions[type][id] = newDefinition;
else
currentDefinition->mSubDefinitions[type][id] = newDefinition;

currentDefinition = newDefinition;
}
break;
case QuickGUI::Token::TYPE_PROPERTY:
{
std::string propertyName = currentToken->value;
DefinitionProperty* newProperty = new DefinitionProperty(propertyName);

// Advance to next Token
++tokenIndex;
currentToken = &(mTokens[tokenIndex]);
while(currentToken->type == QuickGUI::Token::TYPE_TEXT)
{
newProperty->mValues.push_back(currentToken->value);

++tokenIndex;
currentToken = &(mTokens[tokenIndex]);
}

--tokenIndex;
currentToken = &(mTokens[tokenIndex]);

currentDefinition->mProperties[propertyName] = newProperty;
}
break;
case QuickGUI::Token::TYPE_CLOSEBRACE:
currentDefinition = currentDefinition->mParentDefinition;
break;
case QuickGUI::Token::TYPE_EOF:
return;
}

// Advance to next Token
++tokenIndex;
currentToken = &(mTokens[tokenIndex]);
}
}


The code has not been tested. If you see any optimizations or errors in the logic, please post them.

Squirell

07-06-2008 03:52:42

After a brief look, the logic and overall structure seem good. I didn't bother analyzing it line by line, though, because you haven't tested it yet. And I wouldn't worry too much about optimizing it. It isn't realtime, and it will probably be decently quick as it is.

Zini

07-06-2008 09:29:11

I see a few problems:


char str[256];
while(stream.getline(str,256))


DON'T EVER USE CHAR-ARRAYS FOR NON-CONSTANT STRINGS. Sorry for the caps, but that just had to be said. It's ugly, and even though it is kinda safe this way, the 256-character restriction is totally arbitrary. Use a std::string instead.
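
A minimal sketch of the std::string approach suggested here, assuming a plain std::istream rather than an Ogre stream; `readLines` is a hypothetical helper name. With the char-buffer form of getline, a line too long for the buffer sets failbit on the stream, so a read loop like the one quoted above would stop at the first over-long line.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read every line of a stream into std::string, with no fixed length limit.
std::vector<std::string> readLines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
    {
        // Strip a trailing '\r' left over from Windows line endings.
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);
        lines.push_back(line);
    }
    return lines;
}
```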

The handling of whitespace is problematic too. First, you should check for tabs as well. On the other hand, there is no point in checking for stuff like newlines. If you read the source line by line, you won't get them anyway, and if you did get one you certainly shouldn't ignore it. So just drop that part.

And in the createDefinitions part you should do more error checking (e.g. a '}' at the wrong time can lead to a crash with your current code).
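
One way to catch the stray '}' case before building anything is a depth counter over the token stream. In the draft, a TYPE_CLOSEBRACE token seen while currentDefinition is still NULL dereferences a null pointer. The sketch below is only an illustration: `BraceToken` and `bracesBalanced` are hypothetical names, and the enum stands in for the real token types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduced token kinds; everything that is not a brace is OTHER.
enum BraceToken { OPEN_BRACE, CLOSE_BRACE, OTHER };

// Returns true when every '}' has a matching '{' and nothing is left open.
bool bracesBalanced(const std::vector<BraceToken>& tokens)
{
    int depth = 0;
    for (std::size_t i = 0; i < tokens.size(); ++i)
    {
        if (tokens[i] == OPEN_BRACE)
        {
            ++depth;
        }
        else if (tokens[i] == CLOSE_BRACE)
        {
            if (depth == 0)
                return false; // '}' with no open definition
            --depth;
        }
    }
    return depth == 0; // false if a definition was never closed
}
```

The same depth test could instead live inside the TYPE_CLOSEBRACE case, reporting an error rather than following a null parent pointer.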

You maybe should not call s.length() all over the code. There is no guarantee that this value is cached by the string class. I am not promoting premature optimization here, but this kind of thing can easily end up as O(n*n).

kungfoomasta

08-06-2008 01:44:56

Thanks for the feedback all. I've made a few changes now that it's Ogre-ified.


void ScriptReader::parseScript(Ogre::DataStreamPtr &stream, const Ogre::String &groupName)
{
// Read in all data
while(!stream->eof())
{
_convertToTokens(stream->getLine(),mTokens);
}
mTokens.push_back(Token(QuickGUI::Token::TYPE_EOF,""));

_createDefinitions(mTokens,mDefinitions);

// We no longer need the tokens, get rid of them. (There are probably quite a few!)
mTokens.clear();
}


Good call about checking newlines, I realized that also and yanked it out. :oops:

Zini, can you elaborate more about the use of length? I see a few places it's used, but is there a better way? Which function are you referring to?

Here are the remaining two:


void ScriptReader::_convertToTokens(Ogre::String s, std::vector<Token>& tokenList)
{
int index = 0;
while(index < static_cast<int>(s.length()))
{
if(s[index] == '{')
{
// merge the last two TYPE_TEXT tokens into a "definition" token
int lastTokenIndex = static_cast<int>(tokenList.size() - 1);
if((lastTokenIndex < 2) || (tokenList[lastTokenIndex].type != QuickGUI::Token::TYPE_NEWLINE) || (tokenList[lastTokenIndex - 1].type != QuickGUI::Token::TYPE_TEXT))
{
std::cout << "Invalid syntax for Definition!" << std::endl;
return;
}

Ogre::String type = tokenList[lastTokenIndex - 2].value;
Ogre::String id = tokenList[lastTokenIndex - 1].value;
tokenList.pop_back(); // remove newline
tokenList.pop_back(); // remove text
tokenList.pop_back(); // remove text
tokenList.push_back(Token(QuickGUI::Token::TYPE_DEFINITION,type + " " + id)); // add definition
tokenList.push_back(Token(QuickGUI::Token::TYPE_NEWLINE,"\n")); // add newline
tokenList.push_back(Token(QuickGUI::Token::TYPE_OPENBRACE,"{"));

++index;
}
else if(s[index] == '}')
{
tokenList.push_back(Token(QuickGUI::Token::TYPE_CLOSEBRACE,"}"));
++index;
}
else if(!isspace(s[index]))
{
Ogre::String text = "";
while((index < static_cast<int>(s.length())) && !isspace(s[index]))
{
text += s[index];
++index;
}

// Determine if this portion of text is a property name or a property value.
// A property name will have a newline in front of it.
int lastTokenIndex = static_cast<int>(tokenList.size() - 1);
if((lastTokenIndex > 0) && (tokenList[lastTokenIndex].type == QuickGUI::Token::TYPE_NEWLINE))
tokenList.push_back(Token(QuickGUI::Token::TYPE_PROPERTY,text));
else
tokenList.push_back(Token(QuickGUI::Token::TYPE_TEXT,text));
}
else
{
++index;
}
}

tokenList.push_back(Token(QuickGUI::Token::TYPE_NEWLINE,"\n"));
}

void ScriptReader::_createDefinitions(std::vector<Token>& tokenList, std::map<Ogre::String, std::map<Ogre::String,ScriptDefinition*> >& defList)
{
int tokenIndex = 0;

ScriptDefinition* currentDefinition = NULL;

Token* currentToken = &(tokenList[tokenIndex]);

while(1)
{
switch(currentToken->type)
{
case QuickGUI::Token::TYPE_DEFINITION:
{
int index = static_cast<int>(currentToken->value.find_first_of(' '));
Ogre::String type = currentToken->value.substr(0,index);
Ogre::String id = currentToken->value.substr(index+1);
ScriptDefinition* newDefinition = new ScriptDefinition(type,id);

newDefinition->mParentDefinition = currentDefinition;

if(currentDefinition == NULL)
defList[type][id] = newDefinition;
else
currentDefinition->mSubDefinitions[type][id] = newDefinition;

currentDefinition = newDefinition;
}
break;
case QuickGUI::Token::TYPE_PROPERTY:
{
Ogre::String propertyName = currentToken->value;
DefinitionProperty* newProperty = new DefinitionProperty(propertyName);

// Advance to next Token
++tokenIndex;
currentToken = &(tokenList[tokenIndex]);
while(currentToken->type == QuickGUI::Token::TYPE_TEXT)
{
newProperty->mValues.push_back(currentToken->value);

++tokenIndex;
currentToken = &(tokenList[tokenIndex]);
}

--tokenIndex;
currentToken = &(tokenList[tokenIndex]);

currentDefinition->mProperties[propertyName] = newProperty;
}
break;
case QuickGUI::Token::TYPE_CLOSEBRACE:
currentDefinition = currentDefinition->mParentDefinition;
break;
case QuickGUI::Token::TYPE_EOF:
return;
}

// Advance to next Token
++tokenIndex;
currentToken = &(tokenList[tokenIndex]);
}
}


I made the functions take the lists to populate, since I added the ability for the parser to parse a file and return a list of definitions. 8)

I will need to beef it up a little at some point. Aside from that, I'm really pleased with the progress so far.

Zini

08-06-2008 08:25:22


Zini, can you elaborate more about the use of length? I see a few places it's used, but is there a better way? Which function are you referring to?


std::size_t length = s.length();

Place this line outside of any loop and use the variable length instead of the length function inside the loops. It may help or may not help (depending on implementation details of the compiler and the string-class). But if you don't do it and have bad luck, you can get really bad performance.

Edit: Oops! Somehow I had in mind that the string would hold the entire file. But since it's only a single line, you can forget about my advice regarding length.

kungfoomasta

09-06-2008 04:33:51

Almost done updating the serialization system. I've created the ScriptReader and ScriptWriter, and updated my SerialWriter and SerialReader classes.

Here is what the old serialization looked like:


Sheet DefaultSheet.DefaultSheet
{
DescClass SheetDesc
ConsumeKeyboardEvents false
Enabled true
Dimensions 0 0 800 600
Dragable false
HorizontalAnchor ANCHOR_HORIZONTAL_LEFT
HoverTime 3
MaxSize 0 0
MinSize 0 0
Name DefaultSheet
Scrollable true
Type transparent
VerticalAnchor ANCHOR_VERTICAL_TOP
Visible true

Child0 Panel DefaultSheet.MainPanel
Child1 Button DefaultSheet.TestButton
Child2 ToolBar DefaultSheet.TestToolBar
Child3 TextBox DefaultSheet.TestTextBox
Child4 TextBox DefaultSheet.TestTextBox2
Child5 TextArea DefaultSheet.TestArea1

Window0 DefaultSheet.Window1
}

Window DefaultSheet.Window1
{
DescClass WindowDesc
ConsumeKeyboardEvents false
Enabled true
Dimensions 200 200 200 200
Dragable false
HorizontalAnchor ANCHOR_HORIZONTAL_LEFT
HoverTime 3
MaxSize 0 0
MinSize 0 0
Name Window1
Scrollable true
Type transparent
VerticalAnchor ANCHOR_VERTICAL_TOP
Visible true
TitleBar true
TitleBarCloseButton true
TitleBarType qgui

Child0 Panel DefaultSheet.Window1Panel
}


And here is what the new serialization looks like:


Sheet DefaultSheet
{
ConsumeKeyboardEvents false
Dimensions 0 0 800 600
Dragable false
Enabled true
HorizontalAnchor ANCHOR_HORIZONTAL_LEFT
HoverTime 3
MaxSize 0 0
MinSize 0 0
Name DefaultSheet
Scrollable true
Type transparent
VerticalAnchor ANCHOR_VERTICAL_TOP
Visible true

Child Windows
{
Window Window1
{
ConsumeKeyboardEvents false
Dimensions 200 200 200 200
Dragable false
Enabled true
HorizontalAnchor ANCHOR_HORIZONTAL_LEFT
HoverTime 3
MaxSize 0 0
MinSize 0 0
Name Window1
Scrollable true
TitleBar true
TitleBarCloseButton true
TitleBarType qgui
Type transparent
VerticalAnchor ANCHOR_VERTICAL_TOP
Visible true

Child widgets
{
Panel Window1Panel
{
ConsumeKeyboardEvents false
Dimensions 50 50 100 100
Dragable false
Enabled true
HorizontalAnchor ANCHOR_HORIZONTAL_LEFT
HoverTime 3
MaxSize 0 0
MinSize 0 0
Name Window1Panel
Scrollable true
Type transparent
VerticalAnchor ANCHOR_VERTICAL_TOP
Visible true

Child widgets
{
}

}

}

}

}

Child widgets
{
Button TestButton
{
...


I will probably polish up the formatting so it's easy to distinguish the property names from the property values (like in the first example). It's nice being able to use nested braces. :)
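
For the formatting polish mentioned above, one possible approach is a recursive writer that indents each nesting level and pads property names so the values line up in a column. This is only a sketch under assumptions: `Node` and `writeDefinition` are hypothetical names and are not the ScriptWriter or ScriptDefinition classes, and the four-space indent is arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for illustration only.
struct Node
{
    std::string type;
    std::string id;
    std::vector<std::pair<std::string, std::string> > properties; // name, values
    std::vector<Node> children;
};

// Write a definition and its children, one indent level per nesting depth.
void writeDefinition(const Node& node, std::ostream& out, int depth = 0)
{
    const std::string indent(depth * 4, ' ');
    out << indent << node.type << ' ' << node.id << '\n' << indent << "{\n";

    // Find the longest property name so the values start in the same column.
    std::size_t width = 0;
    for (const auto& p : node.properties)
        width = std::max(width, p.first.size());

    for (const auto& p : node.properties)
        out << indent << "    " << p.first
            << std::string(width - p.first.size() + 1, ' ') << p.second << '\n';

    // Children are written one level deeper, inside the parent's braces.
    for (const Node& child : node.children)
        writeDefinition(child, out, depth + 1);

    out << indent << "}\n";
}
```

The padding is computed per definition, so a long property name in one widget does not push the value column out in every other widget.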