2. Types and Variables
======================

**Type** is an attribute of data, which defines the meaning of the data
and the operations that can be performed on the data. Types can be
divided into built-in types and user-defined types. Built-in types refer
to some basic types built into the Berry language, among which types
that are not based on class definitions are called **Simple type**.
Types based on class definitions are called **Class type**, some of the
built-in types are class types, and user-defined types are also class
types.

2.1 Built-in type
-----------------

2.1.1 Simple Type
~~~~~~~~~~~~~~~~~

``nil``
^^^^^^^

The Nil type is the null type, which means that the object has an
invalid value, or it can be said that the object has no meaningful
value. This is a very special type. Although we might say that a
variable is ``nil``, in fact the nil type has no value, so what we are
talking about here is that the type of the variable is nil (not a
value).

The default value of a variable before assignment is ``nil``. This type
can be used in logic operations. In logic operations, ``nil`` is
equivalent to ``false``.

Integer type
^^^^^^^^^^^^

The integer type (integer) represents a signed integer, referred to as
an integer. The number of bits of the integer represented by this type
depends on the specific implementation, and usually consists of a 32-bit
signed integer on a 32-bit platform. Integer is an arithmetic type and
supports all arithmetic operations. Pay attention to the value range of
integers when using this type. The typical value range of 32-bit signed
integers is between  −2147483648 and 2147483647.

Any value can be converted to ``int`` using the ``int()`` function;
however ``int(nil) == nil``. If the argument is an instance and if it
contains a member ``toint()`` it we be called and the return value
converted to ``int``.

Real Number Type
^^^^^^^^^^^^^^^^

The real type (real), to be precise, is a floating-point type. Real
number types are usually implemented as single-precision floating-point
numbers or double-precision floating-point numbers. The real number type
is also an arithmetic type. Compared with the integer type, the real
number type has higher precision and a larger value range, so this type
is more suitable for mathematical calculations. It should be noted that
the real number type is actually a floating point number, so there are
still precision problems. For example, it is not recommended to compare
two values of type ``real`` for equality.

When integers and real numbers participate in operations at the same
time, the integers are usually converted to real numbers.

Boolean type
^^^^^^^^^^^^

The Boolean type (boolean) is used for logical operations. It has two
values ``true`` and ``false``, which represent the two true values (true
and false) in logic and Boolean algebra. The Boolean type is mainly used
for conditional judgment. The operands and return values of logical
expressions and relational expressions are all boolean types, and
statements such as ``if`` and ``while`` all use boolean types as
conditional checks.

In many cases, non-boolean values can also be used as boolean types.
This is because the interpreter will implicitly convert the parameters.
This is also the reason that conditional check expressions such as
``if`` statements can use any type of parameters. The rules for
converting various types to Boolean types are:

-  ``nil``: converted to ``false``.

-  **Integer**: when the value is ``0``, it is converted to ``false``,
   otherwise it is converted to ``true``.

-  **Real number**: when the value is ``0.0``, it is converted to
   ``false``, otherwise it is converted to ``true``.

-  **String**: when the value is "" (empty string) it is concerted to
   ``false`` otherwise it is converted to ``true``.

-  **Comobj** and **Comptr**: when the internal pointer is ``NULL`` it
   is converted to ``false``, otherwise it is converted to ``true``.

-  **Instance**: if the instance contains a method ``tobool()``, the
   return value of the method will be used, otherwise it will be
   converted to ``true``.

-  All other types: convert to ``true``.

Any value can be converted to ``bool`` using ``bool()`` function.

String
^^^^^^

A string is a sequence of characters. In terms of storage, Berry divides
strings into long strings and short strings. There is only one instance
of the same short character string in memory, and all short character
strings are linked in a hash table. This design helps to improve the
performance of string equality comparison and can reduce memory usage.
Since the use frequency of long strings is low, and the overhead of hash
operation is quite high, they are not linked to the hash table, so there
may be multiple identical instances in the memory. The string is
read-only after it is created. Therefore, “modifying” the string will
generate a new string, and the original string will not be modified.

Berry does not care about the format or encoding of characters. For
example, the string ``’abc’`` is actually the ASCII code of the
characters ``’a’``, ``’b’`` and ``’c’``. Therefore, if there are wide
characters in the string (the character length is greater than 1 byte),
the number of characters in the string cannot be directly counted. In
fact, using the ``size()`` function can only get the number of bytes in
the string. In addition, in order to facilitate interaction with the C
language, Berry’s string always ends with ``’\0’`` characters. This
feature is transparent to the Berry program.

The string type can be compared in size, so it can be used in relational
operations.

Function
^^^^^^^^

A function is a piece of code that is encapsulated and available for
call, generally used to implement a specific function. Function is
actually a big category, which includes several subtypes such as
closures, native functions, and native closures. For Berry code, all
function subtypes have the same behavior. Functions belong to the first
type of value in Berry, so they can be passed as values. In addition, it
can be directly used in expressions through the “literal” form of
“anonymous functions”.

A function is a read-only object and cannot be modified once defined.
You can compare whether two functions are equal (whether they are the
same function), but the function type cannot be compared.\ **Native
function** (native function) and **Native closure** (native closure)
refer to functions and closures implemented in C language. One of the
main purposes of native functions and native closures is to provide
functions that the Berry language does not provide, such as IO
operations and low-level operations. If a piece of code is used
frequently and has performance requirements, it is also recommended to
rewrite it as a native function or native closure.

Class
^^^^^

In object-oriented programming, a class is an extensible program code
template. Classes are used to create instance objects, so the class can
be said to be the “type” of the instance. All instance objects are of
type ``instance`` and they all have a corresponding class, which is
called instance **Class type**. To put it simply, a class is a value
representing the type of an instance object, and a class is an
abstraction of the characteristics of an instance. A class is also a
read-only object, once defined, it cannot be modified.

Classes can only compare equals and unequals, but cannot compare sizes.

Examples
^^^^^^^^

An instance is a materialized object generated by a class, and the
process of generating an instance from a class is called
``Instantiate``. In object-oriented programming, “instance” is usually
synonymous with “object”. However, in order to distinguish from
non-instance objects, we do not use the term “object” alone, but use
“instance” or “instance object”. Berry instances are always allocated
dynamically and need to be used with a garbage collector. In addition to
memory allocation, the process of instantiation also needs to initialize
the instance, this process is completed by ``Constructor``. In addition,
you can complete the destruction of the object through ``Destructor``
before reclaiming the object’s memory.

In the internal implementation, the instance will contain a reference to
the class, and the instance itself only stores member variables and not
methods.

2.1.2 Class Type
~~~~~~~~~~~~~~~~

Some of the built-in types are class types, they are ``list``, ``map``
and ``range``. Unlike custom types, built-in class types can be
constructed using literals, for example ``[1, 2, 3]`` is a literal of
type ``list``.

List
^^^^

The List class is a container that provides support for list data types.
Berry’s list is an ordered collection of elements, and each element in
the list has a unique integer index, and each element can be accessed
directly according to the index. List supports inserting or deleting
elements at any position, and the element can be of any type. In
addition to using indexes, you can also use iterators to access elements
in the list.

The implementation of List is a dynamic array, and this data structure
has good random access performance. The efficiency of adding and
deleting elements at the end of the list is very high, but the
efficiency of adding and deleting elements in the middle of the list is
low.

The literal initialization method of the List container is to use a list
of objects surrounded by square brackets, and multiple objects are
separated by commas, for example:

.. code:: berry

   []
   ['string']
   [0, 1, 2,'list']

Operations: see chapter 7.

Map
^^^

Map is also a kind of container, map is a collection of key-value pairs,
and each possible key appears at most once in the collection. The Map
container provides the following basic operations:

-  Add key-value pairs to the collection

-  Remove key-value pairs from the collection

-  Modify the value corresponding to an existing key

-  Find the corresponding value by key

Map is implemented using a hash table and has high search efficiency.
The operation of adding and deleting key-value pairs will consume more
time if “re-hashing” occurs.

The Map container can also be initialized using literal values, written
in curly braces to enclose a list of key-value pairs, separated by
colons between keys and values, and separated by commas between
key-value pairs. E.g:

.. code:: berry

   {}
   {'str':'hello'}
   {'str':'hello','int': 45, 78: nil}

Operations: see chapter 7.

Range
^^^^^

The Range container represents an integer range, which is usually used
to iterate in an integer range. This type has a ``__lower__`` member and
``__upper__`` member, which represent the lower and upper bounds of the
range, respectively. The literal value of Range is a pair of integers
connected using the ``..`` operator:

.. code:: berry

   0 .. 10
   -5 .. 5

When the Range class is used for iteration, the elements of the
iteration are all integer values from the lower bound to the upper
bound, including boundary values. For example, the iteration result of
``0..5`` is:

.. code:: berry

   0 1 2 3 4 5

Therefore, it should be noted that for a range of *x* .. (*x*\ +\ *n*),
the number of iterations is *n* + 1. A common construct to iterate
through elements of a list by item is:

.. code:: berry

   for i: 0..size(l)-1

Open range: if you omit the last range, it is implicitly replaced with
MAXINT.

.. code:: berry

   > r = 10..
   > r
   (10..9223372036854775807)

Bytes
^^^^^

Bytes object denote a bytes buffer which can be used to manipulate bytes
buffers or to read/write some C memory areas or structures.

See Chapter 7.

2.2 Variables
-------------

A variable is a storage space with a name, and the data or information
stored in the storage space is called the value of the variable.
Variable names are used to refer to variables in source code. In
different scopes, a variable name can bind multiple independent
variables, but variables have no aliases. The value of the variable can
be accessed or changed at any time during the running of the program.
Berry is a dynamically typed language, so the type of variable value is
determined at runtime, and the variable can store any type of value.

2.2.1 define variables
~~~~~~~~~~~~~~~~~~~~~~

The first way to define a variable is to use an assignment statement to
assign a value to a new variable name:

.. code::

   ’var’ = expression

**variable** is the name of the variable, and the variable name is an
identifier (see section identifier). **expression** is the expression to
initialize the variable.

.. code:: berry

   a = 1
   b ='str'

However, this method of defining variables has some limitations. Take
the following code as an example:

.. code:: berry

   i = 0
   do
       i = 1
       print(i) # 1
   end
   print(i) # 1

The ``do`` statement in the routine constitutes the inner scope. We
modified the value of the variable ``i`` at line 3, and the value of
``i`` is still ``1`` after leaving the inner scope at line 6 . If we
want the variable ``i`` of the inner scope to be an independent
variable, the method of defining the variable by directly assigning to
the new variable name cannot meet the requirement, because the
identifier ``i`` already exists in the outer scope. In this case, the
variable can be defined by the ``var`` keyword:

.. code::

   ’var’ variable
   ’var’ variable = expression

There are two ways of using ``var`` to define a variable: The first is
to follow the variable name **variable** after the keyword ``var``, in
this case the variable will be initialized to ``nil``, and the other is
written in The variable is initialized at the same time as the variable
is defined. In this case, an initial value expression **expression** is
required. Using ``var`` to define a variable has two possible results:
if the current scope does not define the variable of **variable**,
define and initialize the variable, otherwise it is equivalent to
reinitialize the variable. Therefore, the variable defined with ``var``
will shield the variable with the same name in the outer scope.

Now we change the previous example to use the ``var`` keyword to define
variables:

.. code:: berry

   i = 0
   do
       var i = 1
       print(i) # 1
   end
   print(i) # 0

From the modified routine, it can be found that the value of the
variable ``i`` in the inner scope is ``1``, and its value in the outer
scope is ``0``. This proves that after using the ``var`` keyword, a new
variable ``i`` is defined in the inner scope and the variable with the
same name in the outer scope is blocked. After the inner scope ends, the
identifier ``i`` is once again bound to the variable ``i`` in the outer
scope.

When using the ``var`` keyword to define a variable, you can also use a
list of multiple variable names, separated by commas. You can also
initialize one or more variables when defining variables:

.. code:: berry

   var a = 0, b, c ='test'

2.2.2 Scope and Life Cycle
~~~~~~~~~~~~~~~~~~~~~~~~~~

As mentioned earlier, variable names can be bound to multiple variable
entities (storage spaces), and variable names are bound to only one
entity at each position. The entity bound by the variable name needs to
be determined according to the position where the variable name appears.

**Scope** refers to the code area where the name and the entity are
uniquely bound. Outside the scope, the name may be bound to other
entities, or not bound to any entity. The entity is only visible in the
scope bound to the name, that is, the variable is only valid in its
scope.A code block (see block) is a scope. A variable is only available
inside the block, and names in different blocks may bind different
variable entities. The following example demonstrates the scope of
variables:

.. code:: berry

   var i = 0
   do
       var j ='str'
       print(i, j) # 0 str
   end
   # The variable j is not available here
   print(i) # 0

The names ``i`` and ``j`` are defined in this routine. The name ``i`` is
defined outside the ``do`` sentence, and the name defined in the
outermost block has **Global scope** (global scope). The name with
global scope is available in the entire program after customization. The
name ``j`` is defined in the block in the ``do`` sentence, and the name
of this type of definition in the non-outermost block has **Local
scope** (local scope). A name with a local scope cannot be accessed
outside the scope.

Berry has some built-in objects, which are all in the global scope.
However, built-in objects and global variables defined in scripts are
not in the same global scope. Built-in objects actually belong to
**Built-in scope** (built-in scope). The scope is globally visible as
the ordinary global scope, but can be covered by the ordinary global
scope. Built-in objects include functions and classes in the standard
library. These objects include ``print`` functions, ``type`` functions,
and ``map`` classes. Different from other scopes, the variables in the
built-in scope are read-only, so “assignment” to the variables in the
built-in scope actually defines a variable with the same name in the
global scope, which overrides The symbols in the built-in scope.

nested scope
^^^^^^^^^^^^

Nested scope means that the scope contains another scope. We call the
contained scope **Inner scope**, and the scope that contains the inner
scope **Outer scope**. The name defined in the outer scope can be
accessed in all inner scopes. The inner scope can also rebind the name
already defined in the outer scope. The previous example using ``var``
to define variables describes this scenario.

Variable Life Cycle
^^^^^^^^^^^^^^^^^^^

There is no concept of variable names when the program is running, and
variables exist in the form of entities at this time. The “validity
period” of a variable during program execution is the variable’s **Life
cycle**. Variables at runtime are only valid within the scope. After
leaving the scope, the variables will be destroyed to reclaim resources.

Variables defined in the global scope are called **Global variable** and
have **Static life cycle**. Such variables can be accessed during the
entire program running and will not be destroyed. Variables defined in
the local scope are called **Local variable** and have **Dynamic life
cycle**. Such variables cannot be accessed after leaving the scope and
will be destroyed.

Due to the different life cycles, local variables and global variables
use different ways to allocate storage space. Local variables are
allocated on a structure called **Stack** (stack), and objects allocated
based on the stack can be quickly reclaimed at the end of the scope.
Global variables are allocated in **Global table** (global table).
Objects in the global table will not be recycled once they are created,
and the table can be accessed anywhere in the program.

2.2.3 Type of variable
~~~~~~~~~~~~~~~~~~~~~~

Berry determines the type of the variable at runtime. In other words,
the variable can store any type of value. Therefore Berry is a **Dynamic
typing** language. The interpreter does not deduce the type of the
variable at compile time, which may cause some errors to be exposed at
runtime. For example, the error generated by executing the expression
``’1’ + 1`` is a runtime error rather than a compiler error. The
advantage of using dynamic types is that many designs can be simplified,
and the program will be more flexible, not to mention the need to design
a complex type inference system.

Due to the lack of type checking by the interpreter, user code may need
to determine the type of value by itself, and this feature can also be
used to implement some special operations. This feature also makes
overloaded functions unnecessary. For example, the native function
``type`` accepts any type of parameter and returns a string describing
the parameter type.