2. Types and Variables

A type is an attribute of data that defines what the data means and which operations can be performed on it. Types are divided into built-in types and user-defined types. Built-in types are the basic types provided by the Berry language. Built-in types that are not based on class definitions are called simple types, while types based on class definitions are called class types. Some built-in types are class types, and all user-defined types are class types.

2.1 Built-in type

2.1.1 Simple Type

nil

The nil type is the null type: it indicates that an object has an invalid value or no meaningful value. It is a special type. Although we may say that a variable “is nil”, the nil type has no value of its own, so the statement really means that the type of the variable is nil.

The default value of a variable before assignment is nil. This type can be used in logic operations. In logic operations, nil is equivalent to false.
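The following minimal sketch illustrates both points: a variable declared without an initializer holds nil, and nil behaves like false in a condition. The variable name a is chosen only for illustration.

```berry
# A variable declared without an initial value holds nil
var a
print(a)          # nil
print(type(a))    # nil

# In a logic operation nil behaves like false
if !a
    print('a is treated as false')
end
```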

Integer type

The integer type (integer) represents a signed integer. The number of bits depends on the specific implementation; on a 32-bit platform it is usually a 32-bit signed integer. Integer is an arithmetic type and supports all arithmetic operations. Pay attention to the value range when using this type: a 32-bit signed integer typically ranges from −2147483648 to 2147483647.

Any value can be converted to an integer using the int() function; however, int(nil) == nil. If the argument is an instance that contains a toint() member, that member is called and its return value is converted to int.
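As a rough illustration of these conversion rules, the sketch below converts a string, a real number, nil and an instance. The class Celsius and its member degrees are hypothetical names introduced only for this example.

```berry
# Hypothetical class, used only to show the toint() hook
class Celsius
    var degrees
    def init(degrees)
        self.degrees = degrees
    end
    def toint()
        return self.degrees
    end
end

print(int('42'))          # 42
print(int(3.7))           # 3 (the fractional part is dropped)
print(int(nil))           # nil
print(int(Celsius(21)))   # 21, obtained by calling toint()
```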

Real Number Type

The real type (real) is, to be precise, a floating-point type. It is usually implemented as a single-precision or double-precision floating-point number. The real type is also an arithmetic type. Compared with the integer type, it has higher precision and a larger value range, so it is more suitable for mathematical calculations. Because a real number is actually a floating-point number, precision problems remain. For example, comparing two values of type real for equality is not recommended.

When integers and real numbers participate in operations at the same time, the integers are usually converted to real numbers.
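The sketch below shows one common way to work around the equality problem: compare with a tolerance. The helper nearly_equal and the tolerance 0.0001 are assumptions made for this example, not part of Berry. The last two lines show the integer-to-real promotion in mixed arithmetic.

```berry
# Hypothetical helper: true when a and b differ by at most eps
def nearly_equal(a, b, eps)
    var diff = a - b
    if diff < 0
        diff = -diff
    end
    return diff <= eps
end

var sum = 0.1 + 0.2
print(sum == 0.3)                       # may be false because of rounding
print(nearly_equal(sum, 0.3, 0.0001))   # true

print(type(1 + 2))      # int
print(type(1 + 2.0))    # real: the integer operand is converted
```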

Boolean type

The Boolean type (boolean) is used for logical operations. It has two values, true and false, which represent the two truth values in logic and Boolean algebra. The Boolean type is mainly used for conditional judgment. The operands and return values of logical expressions and relational expressions are all of Boolean type, and statements such as if and while use Boolean values as their condition.

In many cases, non-boolean values can also be used as boolean types. This is because the interpreter will implicitly convert the parameters. This is also the reason that conditional check expressions such as if statements can use any type of parameters. The rules for converting various types to Boolean types are:

  • nil: converted to false.

  • Integer: when the value is 0, it is converted to false, otherwise it is converted to true.

  • Real number: when the value is 0.0, it is converted to false, otherwise it is converted to true.

  • String: when the value is “” (empty string), it is converted to false, otherwise it is converted to true.

  • Comobj and Comptr: when the internal pointer is NULL it is converted to false, otherwise it is converted to true.

  • Instance: if the instance contains a method tobool(), the return value of the method will be used, otherwise it will be converted to true.

  • All other types: convert to true.

Any value can be converted to bool using the bool() function.
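The conversion rules listed above can be checked with a short sketch such as the following. The class Queue and its member items are hypothetical and exist only to show the tobool() hook.

```berry
print(bool(nil))    # false
print(bool(0))      # false
print(bool(42))     # true
print(bool(0.0))    # false
print(bool(''))     # false
print(bool('a'))    # true

# Hypothetical class: an instance counts as true only when it holds items
class Queue
    var items
    def init()
        self.items = []
    end
    def tobool()
        return self.items.size() > 0
    end
end

var q = Queue()
print(bool(q))      # false, the queue is empty
q.items.push(1)
print(bool(q))      # true
```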

String

A string is a sequence of characters. In terms of storage, Berry divides strings into long strings and short strings. Identical short strings share a single instance in memory, and all short strings are linked in a hash table. This design improves the performance of string equality comparison and reduces memory usage. Long strings are used less frequently and are expensive to hash, so they are not linked in the hash table, and several identical instances may exist in memory. A string is read-only after it is created. Therefore, “modifying” a string generates a new string, and the original string is not modified.

Berry does not care about the format or encoding of characters. For example, the string ’abc’ is actually the ASCII code of the characters ’a’, ’b’ and ’c’. Therefore, if there are wide characters in the string (the character length is greater than 1 byte), the number of characters in the string cannot be directly counted. In fact, using the size() function can only get the number of bytes in the string. In addition, in order to facilitate interaction with the C language, Berry’s string always ends with ’\0’ characters. This feature is transparent to the Berry program.

Strings can be ordered as well as tested for equality, so they can be used in relational operations.
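A small sketch of the string properties described above, using only ASCII text so that the byte count and the character count coincide. The variable names are illustrative.

```berry
var s = 'abc'
print(size(s))      # 3 (number of bytes)

# "Modifying" a string builds a new one; the original is unchanged
var t = s + 'def'
print(s)            # abc
print(t)            # abcdef

# Strings can be used in relational operations
print(s == 'abc')   # true
print(s < t)        # true
```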

Function

A function is an encapsulated piece of code that can be called, generally used to implement a specific task. Function is actually a broad category that includes several subtypes such as closures, native functions, and native closures. For Berry code, all function subtypes behave the same. Functions are first-class values in Berry, so they can be passed around as values. In addition, they can be used directly in expressions through the “literal” form of anonymous functions.

A function is a read-only object and cannot be modified once defined. You can test whether two functions are equal (whether they are the same function), but functions cannot be ordered.

Native functions and native closures are functions and closures implemented in the C language. One of their main purposes is to provide features that the Berry language itself does not provide, such as I/O and low-level operations. If a piece of code is used frequently and has performance requirements, it is also recommended to rewrite it as a native function or native closure.
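To make the first-class nature of functions concrete, here is a minimal sketch. The names apply, twice, square and alias are illustrative only.

```berry
# A function that receives another function as a value
def apply(f, x)
    return f(x)
end

def twice(x)
    return x * 2
end

# Anonymous function literal stored in a variable
var square = def (x) return x * x end

print(apply(twice, 4))    # 8
print(apply(square, 4))   # 16

# Equality tests whether both names refer to the same function
var alias = twice
print(alias == twice)     # true
print(twice == square)    # false
```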

Class

In object-oriented programming, a class is an extensible code template. Classes are used to create instance objects, so a class can be said to be the “type” of its instances. All instance objects are of type instance, and each has a corresponding class, which is called the class type of the instance. Put simply, a class is a value that represents the type of an instance object, and an abstraction of the characteristics of its instances. A class is also a read-only object: once defined, it cannot be modified.

Classes can only be compared for equality and inequality; they cannot be ordered.

Instance

An instance is a concrete object generated from a class, and the process of generating an instance from a class is called instantiation. In object-oriented programming, “instance” is usually synonymous with “object”. However, to distinguish instances from non-instance objects, we do not use the term “object” alone, but use “instance” or “instance object”. Berry instances are always allocated dynamically and are managed by a garbage collector. In addition to allocating memory, instantiation must also initialize the instance; this is done by the constructor. Likewise, a destructor can clean up the object before its memory is reclaimed.

In the internal implementation, the instance will contain a reference to the class, and the instance itself only stores member variables and not methods.
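The relationship between a class and its instances can be sketched as follows. The class Point and its members are hypothetical names used only for illustration; init is the constructor invoked during instantiation.

```berry
# Hypothetical class used for illustration
class Point
    var x, y                  # member variables, stored in each instance
    def init(x, y)            # constructor
        self.x = x
        self.y = y
    end
    def norm2()               # method, shared through the class
        return self.x * self.x + self.y * self.y
    end
end

var p = Point(3, 4)           # instantiation
print(type(Point))            # class
print(type(p))                # instance
print(classname(p))           # Point
print(p.norm2())              # 25
```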

2.1.2 Class Type

Some of the built-in types are class types: list, map and range. Unlike user-defined types, built-in class types can be constructed using literals. For example, [1, 2, 3] is a literal of type list.

List

The List class is a container that provides support for list data types. Berry’s list is an ordered collection of elements, and each element in the list has a unique integer index, and each element can be accessed directly according to the index. List supports inserting or deleting elements at any position, and the element can be of any type. In addition to using indexes, you can also use iterators to access elements in the list.

The implementation of List is a dynamic array, and this data structure has good random access performance. The efficiency of adding and deleting elements at the end of the list is very high, but the efficiency of adding and deleting elements in the middle of the list is low.

The literal initialization method of the List container is to use a list of objects surrounded by square brackets, and multiple objects are separated by commas, for example:

[]
['string']
[0, 1, 2, 'list']
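As a brief sketch of the operations mentioned above (indexing, insertion, deletion and iteration), assuming the list methods push, insert, remove and size:

```berry
var l = [0, 1, 2, 'list']
l.push(3.5)            # append at the end (efficient)
l.insert(1, 'new')     # insert before index 1 (slower, elements are shifted)
l.remove(0)            # delete the element at index 0

print(l[0])            # new
print(l.size())        # 5

# Iterate over the elements with an iterator
for item : l
    print(item)
end
```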

Operations: see chapter 7.

Map

Map is also a kind of container. A map is a collection of key-value pairs in which each possible key appears at most once. The Map container provides the following basic operations:

  • Add key-value pairs to the collection

  • Remove key-value pairs from the collection

  • Modify the value corresponding to an existing key

  • Find the corresponding value by key

Map is implemented using a hash table and has high search efficiency. The operation of adding and deleting key-value pairs will consume more time if “re-hashing” occurs.

The Map container can also be initialized using a literal: a list of key-value pairs enclosed in curly braces, with a colon between each key and its value and commas between the pairs. For example:

{}
{'str': 'hello'}
{'str': 'hello', 'int': 45, 78: nil}
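The four basic operations listed earlier can be sketched as follows, assuming the map methods remove, contains, find and size. The keys used here are illustrative.

```berry
var m = {'str': 'hello', 'int': 45}

m['real'] = 1.5            # add a key-value pair
m['int'] = 46              # modify the value of an existing key
m.remove('str')            # remove a key-value pair

print(m['int'])                        # 46
print(m.contains('str'))               # false
print(m.find('missing', 'default'))    # default
print(m.size())                        # 2
```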

Operations: see chapter 7.

Range

The Range container represents an integer range, which is usually used to iterate in an integer range. This type has a __lower__ member and __upper__ member, which represent the lower and upper bounds of the range, respectively. The literal value of Range is a pair of integers connected using the .. operator:

0 .. 10
-5 .. 5

When the Range class is used for iteration, the elements of the iteration are all integer values from the lower bound to the upper bound, including boundary values. For example, the iteration result of 0..5 is:

0 1 2 3 4 5

Note therefore that for a range x .. (x+n), the number of iterations is n + 1. A common construct to iterate through the elements of a list by index is:

for i: 0..size(l)-1
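A complete version of this loop might look like the sketch below. The list contents and the counter count are illustrative; the second loop confirms that both bounds are included.

```berry
var l = ['a', 'b', 'c']

# Visit each element by index; the upper bound is size(l) - 1
for i : 0 .. size(l) - 1
    print(i, l[i])
end

# 0 .. 5 yields six values because both bounds are included
var count = 0
for i : 0 .. 5
    count += 1
end
print(count)    # 6
```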

Open range: if you omit the upper bound, it is implicitly replaced with MAXINT.

> r = 10..
> r
(10..9223372036854775807)

Bytes

A Bytes object represents a byte buffer. It can be used to manipulate raw bytes or to read and write C memory areas or structures.

Operations: see chapter 7.

2.2 Variables

A variable is a storage space with a name, and the data or information stored in the storage space is called the value of the variable. Variable names are used to refer to variables in source code. In different scopes, a variable name can bind multiple independent variables, but variables have no aliases. The value of the variable can be accessed or changed at any time during the running of the program. Berry is a dynamically typed language, so the type of variable value is determined at runtime, and the variable can store any type of value.

2.2.1 Defining Variables

The first way to define a variable is to use an assignment statement to assign a value to a new variable name:

variable = expression

Here variable is the name of the variable, which must be an identifier (see the section on identifiers), and expression is the expression used to initialize the variable.

a = 1
b = 'str'

However, this method of defining variables has some limitations. Take the following code as an example:

i = 0
do
    i = 1
    print(i) # 1
end
print(i) # 1

The do statement in this example forms an inner scope. We modify the value of the variable i at line 3, and the value of i is still 1 after leaving the inner scope at line 6. If we want the variable i of the inner scope to be an independent variable, defining it by assigning directly to the name does not work, because the identifier i already exists in the outer scope. In this case, the variable can be defined with the var keyword:

’var’ variable
’var’ variable = expression

There are two ways of using var to define a variable. The first is to write only the variable name after the keyword var, in which case the variable is initialized to nil. The second is to initialize the variable at the same time as it is defined, in which case an initial value expression is required. Using var to define a variable has two possible results: if the current scope does not yet define a variable with that name, the variable is defined and initialized; otherwise the existing variable is reinitialized. Consequently, a variable defined with var shadows a variable of the same name in the outer scope.

Now we change the previous example to use the var keyword to define variables:

i = 0
do
    var i = 1
    print(i) # 1
end
print(i) # 0

In the modified example, the value of the variable i is 1 in the inner scope and 0 in the outer scope. This shows that using the var keyword defines a new variable i in the inner scope, which shadows the variable of the same name in the outer scope. After the inner scope ends, the identifier i is once again bound to the variable i of the outer scope.

When using the var keyword to define a variable, you can also use a list of multiple variable names, separated by commas. You can also initialize one or more variables when defining variables:

var a = 0, b, c = 'test'

2.2.2 Scope and Life Cycle

As mentioned earlier, variable names can be bound to multiple variable entities (storage spaces), and variable names are bound to only one entity at each position. The entity bound by the variable name needs to be determined according to the position where the variable name appears.

Scope refers to the region of code in which a name is uniquely bound to an entity. Outside the scope, the name may be bound to other entities, or not bound to any entity. The entity is only visible in the scope in which it is bound to the name; that is, the variable is only valid in its scope.

A code block (see the section on blocks) is a scope. A variable is only available inside its block, and names in different blocks may be bound to different variable entities. The following example demonstrates the scope of variables:

var i = 0
do
    var j = 'str'
    print(i, j) # 0 str
end
# The variable j is not available here
print(i) # 0

The names i and j are defined in this example. The name i is defined outside the do statement; a name defined in the outermost block has global scope. A name with global scope is available in the entire program after its definition. The name j is defined in the block of the do statement; a name defined in a block other than the outermost one has local scope. A name with local scope cannot be accessed outside that scope.

Berry has some built-in objects, which are all globally visible. However, built-in objects and global variables defined in scripts are not in the same global scope. Built-in objects actually belong to the built-in scope. This scope is globally visible, like the ordinary global scope, but names in it can be overridden by the ordinary global scope. Built-in objects include the functions and classes of the standard library, such as the print function, the type function, and the map class. Unlike other scopes, the variables in the built-in scope are read-only, so “assigning” to a variable in the built-in scope actually defines a variable with the same name in the global scope, which overrides the symbol in the built-in scope.
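The overriding behavior described above can be sketched as follows. This is only an illustration under the rules stated in this section; the name builtin_type is hypothetical, and replacing a built-in such as type is rarely a good idea in real code.

```berry
# Keep a reference to the built-in function before overriding its name
var builtin_type = type

# "Assigning" to the built-in name defines a global variable of the same name
type = def (x) return 'custom' end

print(type(1))            # custom (the global definition is found first)
print(builtin_type(1))    # int (the built-in function itself is unchanged)
```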

Nested Scope

Nested scope means that one scope contains another scope. We call the contained scope the inner scope, and the scope that contains it the outer scope. A name defined in the outer scope can be accessed in all of its inner scopes. The inner scope can also rebind a name that is already defined in the outer scope. The previous example of using var to define variables describes this scenario.

Variable Life Cycle

Variable names do not exist when the program is running; at that point variables exist only as entities. The period during which a variable is valid while the program executes is the variable’s life cycle. At runtime, variables are only valid within their scope. After the scope is left, the variables are destroyed so that their resources can be reclaimed.

Variables defined in the global scope are called global variables and have a static life cycle. Such variables can be accessed for the entire run of the program and are never destroyed. Variables defined in a local scope are called local variables and have a dynamic life cycle. Such variables cannot be accessed after the scope is left, and they are destroyed.

Because of their different life cycles, local variables and global variables are allocated storage in different ways. Local variables are allocated on a structure called the stack, and objects allocated on the stack can be reclaimed quickly at the end of the scope. Global variables are allocated in the global table. Objects in the global table are not reclaimed once they are created, and the table can be accessed from anywhere in the program.

2.2.3 Type of variable

Berry determines the type of a variable at runtime. In other words, a variable can store a value of any type. Berry is therefore a dynamically typed language. The interpreter does not deduce the type of a variable at compile time, so some errors are only exposed at runtime. For example, the error generated by executing the expression ’1’ + 1 is a runtime error rather than a compile-time error. The advantage of dynamic typing is that many designs can be simplified and programs become more flexible, and there is no need to design a complex type inference system.
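The runtime nature of this error can be observed with a try block, as in the sketch below. The variable caught is illustrative; the exact exception name and message are not shown because they may vary between versions.

```berry
var caught = false
try
    # Compiles without complaint; the failure only appears when this line runs
    print('1' + 1)
except .. as e, msg
    caught = true
    print('runtime error:', e, msg)
end
print(caught)    # true
```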

Due to the lack of type checking by the interpreter, user code may need to determine the type of value by itself, and this feature can also be used to implement some special operations. This feature also makes overloaded functions unnecessary. For example, the native function type accepts any type of parameter and returns a string describing the parameter type.
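For instance, a single function can branch on the result of type instead of relying on overloads. The function describe and its category names are hypothetical, chosen only for this sketch.

```berry
# Hypothetical helper: one function handles several argument types
def describe(v)
    var t = type(v)
    if t == 'int' || t == 'real'
        return 'number'
    elif t == 'string'
        return 'text'
    end
    return t
end

print(describe(1))        # number
print(describe(2.5))      # number
print(describe('abc'))    # text
print(describe(nil))      # nil
```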