Add str() builtin function #1

Open
certik opened this issue Dec 29, 2021 · 10 comments

Comments

@certik
Contributor

certik commented Dec 29, 2021

Once LPython can use ASR's generic function facilities, it should just be a generic function. In the meantime, let's implement:

  • str_int(x) ... converts an integer to a string
  • str_float(x) ... converts a floating point to a string

These will be implemented in pure Python, such as:

def str_int(x):
    ....
    # s = ...
    return s

For now, let's put this into one file, and print the result at the end. Some things will not be implemented in LPython yet, so let's open up issues for things that are not implemented yet, and we'll fix them.

@Smit-create
Collaborator

I tried implementing str_int(x) in three different ways:

  1. Using x: i32
Reference code

def str_int(x: i32) -> str:
    if x == 0:
        return '0'
    result: str
    result = ''
    if x < 0:
        result += '-'
        x = -x
    rev_result: str
    rev_result = ''
    pos_to_str: list
    pos_to_str = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
    while x > 0:
        rev_result += pos_to_str[x%10]
        x = x//10
    for pos in range(len(rev_result) - 1, -1, -1):
        result += rev_result[pos]
    return result

AST

$ lfortran --show-python-ast --indent ser.txt
(Module [
   (FunctionDef 
      str_int 
      ([] [
         (
            x 
            (Name 
               i32 
               Load) ())] [] [] [] [] []) [
      (If 
         (Compare 
            (Name 
               x 
               Load) 
            Eq [
            (ConstantInt 0 ())]) [
         (Return 
            (ConstantStr "0" ()))] []) 
      (AnnAssign 
         (Name 
            result 
            Store) 
         (Name 
            str 
            Load) () 1) 
      (Assign [
         (Name 
            result 
            Store)] 
         (ConstantStr "" ()) ()) 
      (If 
         (Compare 
            (Name 
               x 
               Load) 
            Lt [
            (ConstantInt 0 ())]) [
         (AugAssign 
            (Name 
               result 
               Store) 
            Add 
            (ConstantStr "-" ())) 
         (Assign [
            (Name 
               x 
               Store)] 
            (UnaryOp 
               USub 
               (Name 
                  x 
                  Load)) ())] []) 
      (AnnAssign 
         (Name 
            rev_result 
            Store) 
         (Name 
            str 
            Load) () 1) 
      (Assign [
         (Name 
            rev_result 
            Store)] 
         (ConstantStr "" ()) ()) 
      (AnnAssign 
         (Name 
            pos_to_str 
            Store) 
         (Name 
            list 
            Load) () 1) 
      (Assign [
         (Name 
            pos_to_str 
            Store)] 
         (List [
            (ConstantStr "0" ()) 
            (ConstantStr "1" ()) 
            (ConstantStr "2" ()) 
            (ConstantStr "3" ()) 
            (ConstantStr "4" ()) 
            (ConstantStr "5" ()) 
            (ConstantStr "6" ()) 
            (ConstantStr "7" ()) 
            (ConstantStr "8" ()) 
            (ConstantStr "9" ())] 
            Load) ()) 
      (While 
         (Compare 
            (Name 
               x 
               Load) 
            Gt [
            (ConstantInt 0 ())]) [
         (AugAssign 
            (Name 
               rev_result 
               Store) 
            Add 
            (Subscript 
               (Name 
                  pos_to_str 
                  Load) 
               (BinOp 
                  (Name 
                     x 
                     Load) 
                  Mod 
                  (ConstantInt 10 ())) 
               Load)) 
         (Assign [
            (Name 
               x 
               Store)] 
            (BinOp 
               (Name 
                  x 
                  Load) 
               FloorDiv 
               (ConstantInt 10 ())) ())] []) 
      (For 
         (Name 
            pos 
            Store) 
         (Call 
            (Name 
               range 
               Load) [
            (BinOp 
               (Call 
                  (Name 
                     len 
                     Load) [
                  (Name 
                     rev_result 
                     Load)] []) 
               Sub 
               (ConstantInt 1 ())) 
            (UnaryOp 
               USub 
               (ConstantInt 1 ())) 
            (UnaryOp 
               USub 
               (ConstantInt 1 ()))] []) [
         (AugAssign 
            (Name 
               result 
               Store) 
            Add 
            (Subscript 
               (Name 
                  rev_result 
                  Load) 
               (Name 
                  pos 
                  Load) 
               Load))] [] ()) 
      (Return 
         (Name 
            result 
            Load))] [] 
      (Name 
         str 
         Load) ())] [])

ASR

Traceback (most recent call last):
  Binary file "/home/admin-pc/Smitlunagariya/lfortran/inst/bin/lfortran", in _start()
  File "/build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c", line 310, in __libc_start_main()
  File "/home/admin-pc/Smitlunagariya/lfortran/src/bin/lfortran.cpp", line 1504, in main()
    with_intrinsic_modules, compiler_options);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/bin/lfortran.cpp", line 616, in emit_python_asr()
    r = LFortran::Python::python_ast_to_asr(al, *ast, diagnostics);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 660, in LFortran::Python::python_ast_to_asr(Allocator&, LFortran::Python::AST::ast_t&, LFortran::diag::Diagnostics&)
    auto res2 = body_visitor(al, *ast_m, diagnostics, unit);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 616, in LFortran::Python::body_visitor(Allocator&, LFortran::Python::AST::Module_t&, LFortran::diag::Diagnostics&, LFortran::ASR::asr_t*)
    b.visit_Module(ast);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 299, in LFortran::Python::BodyVisitor::visit_Module(LFortran::Python::AST::Module_t const&)
    visit_stmt(*x.m_body[i]);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1805, in LFortran::Python::AST::BaseVisitor<LFortran::Python::BodyVisitor>::visit_stmt(LFortran::Python::AST::stmt_t const&)
    void visit_stmt(const stmt_t &b) { visit_stmt_t(b, self()); }
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1677, in visit_stmt_t<LFortran::Python::BodyVisitor>()
    case stmtType::FunctionDef: { v.visit_FunctionDef((const FunctionDef_t &)x); return; }
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 312, in LFortran::Python::BodyVisitor::visit_FunctionDef(LFortran::Python::AST::FunctionDef_t const&)
    transform_stmts(body, x.n_body, x.m_body);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 283, in LFortran::Python::BodyVisitor::transform_stmts(LFortran::Vec<LFortran::ASR::stmt_t*>&, unsigned long, LFortran::Python::AST::stmt_t**)
    this->visit_stmt(*m_body[i]);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1805, in LFortran::Python::AST::BaseVisitor<LFortran::Python::BodyVisitor>::visit_stmt(LFortran::Python::AST::stmt_t const&)
    void visit_stmt(const stmt_t &b) { visit_stmt_t(b, self()); }
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1688, in visit_stmt_t<LFortran::Python::BodyVisitor>()
    case stmtType::If: { v.visit_If((const If_t &)x); return; }
LFortranException: visit_If() not implemented

  2. Using x: int
Reference code

def str_int(x: int) -> str:
    if x == 0:
        return '0'
    result: str
    result = ''
    if x < 0:
        result += '-'
        x = -x
    rev_result: str
    rev_result = ''
    pos_to_str: list
    pos_to_str = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
    while x > 0:
        rev_result += pos_to_str[x%10]
        x = x//10
    for pos in range(len(rev_result) - 1, -1, -1):
        result += rev_result[pos]
    return result

AST

(Module [
   (FunctionDef 
      str_int 
      ([] [
         (
            x 
            (Name 
               int 
               Load) ())] [] [] [] [] []) [
      (If 
         (Compare 
            (Name 
               x 
               Load) 
            Eq [
            (ConstantInt 0 ())]) [
         (Return 
            (ConstantStr "0" ()))] []) 
      (AnnAssign 
         (Name 
            result 
            Store) 
         (Name 
            str 
            Load) () 1) 
      (Assign [
         (Name 
            result 
            Store)] 
         (ConstantStr "" ()) ()) 
      (If 
         (Compare 
            (Name 
               x 
               Load) 
            Lt [
            (ConstantInt 0 ())]) [
         (AugAssign 
            (Name 
               result 
               Store) 
            Add 
            (ConstantStr "-" ())) 
         (Assign [
            (Name 
               x 
               Store)] 
            (UnaryOp 
               USub 
               (Name 
                  x 
                  Load)) ())] []) 
      (AnnAssign 
         (Name 
            rev_result 
            Store) 
         (Name 
            str 
            Load) () 1) 
      (Assign [
         (Name 
            rev_result 
            Store)] 
         (ConstantStr "" ()) ()) 
      (AnnAssign 
         (Name 
            pos_to_str 
            Store) 
         (Name 
            list 
            Load) () 1) 
      (Assign [
         (Name 
            pos_to_str 
            Store)] 
         (List [
            (ConstantStr "0" ()) 
            (ConstantStr "1" ()) 
            (ConstantStr "2" ()) 
            (ConstantStr "3" ()) 
            (ConstantStr "4" ()) 
            (ConstantStr "5" ()) 
            (ConstantStr "6" ()) 
            (ConstantStr "7" ()) 
            (ConstantStr "8" ()) 
            (ConstantStr "9" ())] 
            Load) ()) 
      (While 
         (Compare 
            (Name 
               x 
               Load) 
            Gt [
            (ConstantInt 0 ())]) [
         (AugAssign 
            (Name 
               rev_result 
               Store) 
            Add 
            (Subscript 
               (Name 
                  pos_to_str 
                  Load) 
               (BinOp 
                  (Name 
                     x 
                     Load) 
                  Mod 
                  (ConstantInt 10 ())) 
               Load)) 
         (Assign [
            (Name 
               x 
               Store)] 
            (BinOp 
               (Name 
                  x 
                  Load) 
               FloorDiv 
               (ConstantInt 10 ())) ())] []) 
      (For 
         (Name 
            pos 
            Store) 
         (Call 
            (Name 
               range 
               Load) [
            (BinOp 
               (Call 
                  (Name 
                     len 
                     Load) [
                  (Name 
                     rev_result 
                     Load)] []) 
               Sub 
               (ConstantInt 1 ())) 
            (UnaryOp 
               USub 
               (ConstantInt 1 ())) 
            (UnaryOp 
               USub 
               (ConstantInt 1 ()))] []) [
         (AugAssign 
            (Name 
               result 
               Store) 
            Add 
            (Subscript 
               (Name 
                  rev_result 
                  Load) 
               (Name 
                  pos 
                  Load) 
               Load))] [] ()) 
      (Return 
         (Name 
            result 
            Load))] [] 
      (Name 
         str 
         Load) ())] [])

ASR

semantic error: Annotation type not supported
 --> ser.txt:1:1
  |
1 | 0 0 1 0 7 str_int 0 1 1 x 1 26 3 int 0 0 0 0 0 0 0 11 11 15 26 1 x 0 0 1 20 0 0 1 3 1 19 1 0 0 0 7 26 6 result 1 26 3 str 0 0 1 5 1 26 6 result 1 19 0  0 0 11 15 26 1 x 0 2 1 20 0 0 2 6 26 6 result 1 0 19 1 - 0 5 1 26 1 x 1 3 3 26 1 x 0 0 0 7 26 10 rev_result 1 26 3 str 0 0 1 5 1 26 10 rev_result 1 19 0  0 0 7 26 10 pos_to_str 1 26 4 list 0 0 1 5 1 26 10 pos_to_str 1 27 10 19 1 0 0 19 1 1 0 19 1 2 0 19 1 3 0 19 1 4 0 19 1 5 0 19 1 6 0 19 1 7 0 19 1 8 0 19 1 9 0 0 0 10 15 26 1 x 0 4 1 20 0 0 2 6 26 10 rev_result 1 0 24 26 10 pos_to_str 0 2 26 1 x 0 5 20 10 0 0 5 1 26 1 x 1 2 26 1 x 0 12 20 10 0 0 0 8 26 3 pos 1 16 26 5 range 0 3 2 16 26 3 len 0 1 26 10 rev_result 0 0 1 20 1 0 3 3 20 1 0 3 3 20 1 0 0 1 6 26 6 result 1 0 24 26 10 rev_result 0 26 3 pos 0 0 0 0 3 1 26 6 result 0 0 1 26 3 str 0 0 0 
  | ^ 


Note: if any of the above error or warning messages are not clear or are lacking
context please report it to us (we consider that a bug that needs to be fixed).

  3. Using the magic method (__str__)
Reference code

def str_int(x: i32) -> str:
    return x.__str__()

AST

(Module [
   (FunctionDef 
      str_int 
      ([] [
         (
            x 
            (Name 
               i32 
               Load) ())] [] [] [] [] []) [
      (Return 
         (Call 
            (Attribute 
               (Name 
                  x 
                  Load) 
               __str__ 
               Load) [] []))] [] 
      (Name 
         str 
         Load) ())] [])

ASR

Traceback (most recent call last):
  Binary file "/home/admin-pc/Smitlunagariya/lfortran/inst/bin/lfortran", in _start()
  File "/build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c", line 310, in __libc_start_main()
  File "/home/admin-pc/Smitlunagariya/lfortran/src/bin/lfortran.cpp", line 1504, in main()
    with_intrinsic_modules, compiler_options);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/bin/lfortran.cpp", line 616, in emit_python_asr()
    r = LFortran::Python::python_ast_to_asr(al, *ast, diagnostics);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 660, in LFortran::Python::python_ast_to_asr(Allocator&, LFortran::Python::AST::ast_t&, LFortran::diag::Diagnostics&)
    auto res2 = body_visitor(al, *ast_m, diagnostics, unit);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 616, in LFortran::Python::body_visitor(Allocator&, LFortran::Python::AST::Module_t&, LFortran::diag::Diagnostics&, LFortran::ASR::asr_t*)
    b.visit_Module(ast);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 299, in LFortran::Python::BodyVisitor::visit_Module(LFortran::Python::AST::Module_t const&)
    visit_stmt(*x.m_body[i]);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1805, in LFortran::Python::AST::BaseVisitor<LFortran::Python::BodyVisitor>::visit_stmt(LFortran::Python::AST::stmt_t const&)
    void visit_stmt(const stmt_t &b) { visit_stmt_t(b, self()); }
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1677, in visit_stmt_t<LFortran::Python::BodyVisitor>()
    case stmtType::FunctionDef: { v.visit_FunctionDef((const FunctionDef_t &)x); return; }
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 312, in LFortran::Python::BodyVisitor::visit_FunctionDef(LFortran::Python::AST::FunctionDef_t const&)
    transform_stmts(body, x.n_body, x.m_body);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/semantics/python_ast_to_asr.cpp", line 283, in LFortran::Python::BodyVisitor::transform_stmts(LFortran::Vec<LFortran::ASR::stmt_t*>&, unsigned long, LFortran::Python::AST::stmt_t**)
    this->visit_stmt(*m_body[i]);
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1805, in LFortran::Python::AST::BaseVisitor<LFortran::Python::BodyVisitor>::visit_stmt(LFortran::Python::AST::stmt_t const&)
    void visit_stmt(const stmt_t &b) { visit_stmt_t(b, self()); }
  File "/home/admin-pc/Smitlunagariya/lfortran/src/lfortran/python_ast.h", line 1680, in visit_stmt_t<LFortran::Python::BodyVisitor>()
    case stmtType::Return: { v.visit_Return((const Return_t &)x); return; }
LFortranException: visit_Return() not implemented

@certik
Contributor Author

certik commented Dec 30, 2021

Perfect, thanks. Yes, we have to implement if and return. For pos_to_str = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], we'll probably implement it as a 1D array.

@Smit-create
Collaborator

Well, float_to_string requires float_to_int, and we may have to set the precision too. The built-in magic method __str__ might be the best option for str_float.
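
For comparison, here is a minimal sketch of what a pure-Python str_float with an explicit precision could look like. The helper name digits and the precision parameter are assumptions for illustration, not part of LPython. It prints a fixed number of decimal places (unlike CPython's str(), which picks the shortest round-tripping representation) and does not handle inf or nan.

```python
def digits(n: int) -> str:
    # Hypothetical helper: decimal digits of a non-negative integer.
    if n == 0:
        return '0'
    s = ''
    while n > 0:
        s = chr(ord('0') + n % 10) + s
        n = n // 10
    return s


def str_float(x: float, precision: int = 6) -> str:
    # Assumed interface: fixed number of decimal places, no exponent form.
    sign = ''
    if x < 0.0:
        sign = '-'
        x = -x
    scale = 10 ** precision
    # The float-to-int step: scale, then round half up at the last kept digit,
    # so that a carry propagates into the integer part.
    scaled = int(x * scale + 0.5)
    whole = digits(scaled // scale)
    if precision == 0:
        return sign + whole
    frac = digits(scaled % scale)
    # Left-pad the fractional digits with zeros to the requested width.
    while len(frac) < precision:
        frac = '0' + frac
    return sign + whole + '.' + frac
```

Rounding once on the scaled integer keeps the integer and fractional parts consistent, which is the main reason the float-to-int conversion is needed here.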

@certik
Contributor Author

certik commented Jan 3, 2022

Let's use chr(ord('0')+i) instead of ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'][i].
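
A minimal sketch of str_int along these lines, assuming plain CPython with its built-in chr and ord (in LPython these would be the pure-Python versions, and the annotation would be i32 rather than int). Prepending each digit also removes the need for the reversal loop:

```python
def str_int(x: int) -> str:
    if x == 0:
        return '0'
    sign = ''
    if x < 0:
        sign = '-'
        x = -x
    result = ''
    while x > 0:
        # chr(ord('0') + d) maps the digit value d (0..9) to its character.
        result = chr(ord('0') + x % 10) + result
        x = x // 10
    return sign + result
```

This relies only on the digit characters '0' through '9' having consecutive code points, which holds in ASCII and Unicode.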

@certik
Contributor Author

certik commented Jan 3, 2022

Let's implement chr and ord like this:

# Uncomment once we implement the `sys` module
#from sys import exit

def ord(s: str) -> i32:
    if s == '0':
        return 48
    elif s == '1':
        return 49
    else:
        exit(1)

def chr(i: i32) -> str:
    if i == 48:
        return '0'
    elif i == 49:
        return '1'
    else:
        exit(1)

@certik
Contributor Author

certik commented Jan 3, 2022

To implement some of the features that we need, isolate the given feature and try to implement it, such as:

def test1():
    result: str
    result = "a"
    result += "x"
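
The same approach can be applied to the other constructs that str_int depends on. The following is a sketch only; the function names are hypothetical, and plain int annotations are used so that it runs under CPython:

```python
def feature_if_return(x: int) -> int:
    # Exercises only `if`/`else` and `return`.
    if x < 0:
        return -1
    else:
        return 1


def feature_while() -> int:
    # Exercises only `while` and integer floor division.
    x = 1234
    count = 0
    while x > 0:
        count += 1
        x = x // 10
    return count


def feature_str_index() -> str:
    # Exercises only string subscripting inside a reversed `range` loop.
    s = "abc"
    result = ""
    for pos in range(len(s) - 1, -1, -1):
        result += s[pos]
    return result
```

Each function touches one or two features, so a failure points directly at the construct that is still missing.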

@Shaikh-Ubaid
Collaborator

Let's implement chr and ord like this:

# Uncomment once we implement the `sys` module
#from sys import exit

def ord(s: str) -> i32:
    if s == '0':
        return 48
    elif s == '1':
        return 49
    else:
        exit(1)

def chr(i: i32) -> str:
    if i == 48:
        return '0'
    elif i == 49:
        return '1'
    else:
        exit(1)

These ord and chr functions fail for any input other than '0' or '1' (for ord) and 48 or 49 (for chr). How are we planning to deal with this?

@Shaikh-Ubaid
Collaborator

Shaikh-Ubaid commented Mar 29, 2022

Shall I update the functions ord and chr as follows?

def ord(s: str) -> i32:  # supports characters with Unicode values from 32 to 126
    if len(s) != 1:
        exit(1)  # not a single character
    for i in range(32, 127):
        if chr(i) == s:
            return i
    exit(1)  # character outside the supported range

def chr(i: i32) -> str:  # supports Unicode values from 32 to 126
    if i < 32 or i > 126:
        exit(1)  # not yet supported
    all_chars = ' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'
    return all_chars[i - 32]

Testing Code:

testing_chars = ' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'
for ch in testing_chars:
    print(f"{ch}: {ord(ch)}")

Testing Code Output:

 : 32
!: 33
": 34
#: 35
$: 36
%: 37
&: 38
': 39
(: 40
): 41
*: 42
+: 43
,: 44
-: 45
.: 46
/: 47
0: 48
1: 49
2: 50
3: 51
4: 52
5: 53
6: 54
7: 55
8: 56
9: 57
:: 58
;: 59
<: 60
=: 61
>: 62
?: 63
@: 64
A: 65
B: 66
C: 67
D: 68
E: 69
F: 70
G: 71
H: 72
I: 73
J: 74
K: 75
L: 76
M: 77
N: 78
O: 79
P: 80
Q: 81
R: 82
S: 83
T: 84
U: 85
V: 86
W: 87
X: 88
Y: 89
Z: 90
[: 91
\: 92
]: 93
^: 94
_: 95
`: 96
a: 97
b: 98
c: 99
d: 100
e: 101
f: 102
g: 103
h: 104
i: 105
j: 106
k: 107
l: 108
m: 109
n: 110
o: 111
p: 112
q: 113
r: 114
s: 115
t: 116
u: 117
v: 118
w: 119
x: 120
y: 121
z: 122
{: 123
|: 124
}: 125
~: 126

To accomplish #281, I think we might need the ord and chr functions to implement string methods like isalpha, isupper, etc.
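
As a rough illustration of how such methods could be built on ord, here is an ASCII-only sketch. The free-function form and the helper names is_upper_char and is_lower_char are assumptions; it uses CPython's built-in ord and does not cover non-ASCII letters.

```python
def is_upper_char(c: str) -> bool:
    # Hypothetical helper: True for the ASCII letters 'A'..'Z'.
    return ord('A') <= ord(c) <= ord('Z')


def is_lower_char(c: str) -> bool:
    # Hypothetical helper: True for the ASCII letters 'a'..'z'.
    return ord('a') <= ord(c) <= ord('z')


def isalpha(s: str) -> bool:
    # Non-empty and every character is an ASCII letter.
    if len(s) == 0:
        return False
    for c in s:
        if not (is_upper_char(c) or is_lower_char(c)):
            return False
    return True


def isupper(s: str) -> bool:
    # No lowercase letters and at least one uppercase letter,
    # mirroring str.isupper for ASCII input.
    has_upper = False
    for c in s:
        if is_lower_char(c):
            return False
        if is_upper_char(c):
            has_upper = True
    return has_upper
```

Both functions only compare code points, so they need nothing from the runtime beyond ord and string iteration.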

@certik
Contributor Author

certik commented Apr 13, 2022

@Shaikh-Ubaid I think so. I think we might need to fix some bugs in LPython to make it work, but I think an implementation along your lines should work. Can you please submit it as a PR and add some tests?

P.S. My apologies for the late answer; I just noticed your comments now.

@Shaikh-Ubaid
Collaborator

Can you please submit it as a PR and add some tests?

Yes, I will submit the ord and chr implementations along with integration tests.
