WriteStop error runtime error: invalid memory address or nil pointer dereference #550

Open
Sreethecool opened this issue Aug 2, 2023 · 10 comments

Comments

@Sreethecool

I'm getting an "invalid memory address or nil pointer dereference" error from WriteStop. Repro:
`package main

import (
	"encoding/json"
	"fmt"

	"github.com/xitongsys/parquet-go-source/local"
	"github.com/xitongsys/parquet-go/writer"
)

func main() {
	var err error
	// JSON schema passed to NewJSONWriter, held in a raw string literal.
	md := `
	{
	  "Tag": "name=parquet-go",
	  "Fields": [
	    {"Tag": "name=name, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"},
	    {
	      "Tag": "name=data__CronJob, type=LIST",
	      "Fields": [
	        {
	          "Tag": "name=element, type=MAP",
	          "Fields": [
	            {"Tag": "name=key, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"},
	            {"Tag": "name=val, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"}
	          ]
	        }
	      ]
	    },
	    {
	      "Tag": "name=data__Pod, type=LIST",
	      "Fields": [
	        {
	          "Tag": "name=element, type=MAP",
	          "Fields": [
	            {"Tag": "name=key, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"},
	            {"Tag": "name=val, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"}
	          ]
	        }
	      ]
	    },
	    {
	      "Tag": "name=data__Job, type=LIST",
	      "Fields": [
	        {
	          "Tag": "name=element, type=MAP",
	          "Fields": [
	            {"Tag": "name=key, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"},
	            {"Tag": "name=val, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"}
	          ]
	        }
	      ]
	    }
	  ]
	}`

	// Create the output file and the JSON writer.
	fw, err := local.NewLocalFileWriter("json.parquet")
	if err != nil {
		fmt.Println("Can't create file", err)
		return
	}
	pw, err := writer.NewJSONWriter(md, fw, 4)
	if err != nil {
		fmt.Println("Can't create json writer", err)
		return
	}

	rec := `
[
	{
	  "data__CronJob": [
		{
		  "imageID": "",
		  "name": "dlc-cronjob"
		}
	  ],
	  "data__Pod": [
		{
		  "imageID": "icr",
		  "name": "icr"
		},
		{
		  "imageID": "private",
		  "name": "private"
		}
	  ],
	  "name": "bkott3ld04me4oaak7og"
	},
	{
	  "data__CronJob": [
		{
		  "imageID": "dlc",
		  "name": "dlc-cronjob:49"
		}
	  ],
	  "data__Job": [
		{
		  "imageID": "",
		  "name": "iKS"
		}
	  ],
	  "data__Pod": [
		{
		  "imageID": "icrb",
		  "name": "icr6"
		},
		{
		  "imageID": "usf",
		  "name": "us4"
		}
	  ],
	  "name": "bl4qd8ld0gav5i7e4ekg"
	}
]`


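	// Decode the sample records, then write each one as its own JSON string.
	var arr []interface{}
	if err = json.Unmarshal([]byte(rec), &arr); err != nil {
		fmt.Println("Can't unmarshal records", err)
		return
	}
	for _, v := range arr {
		recJson, err := json.Marshal(v)
		if err != nil {
			fmt.Println("Marshal error", err)
			continue
		}
		if err = pw.Write(string(recJson)); err != nil {
			fmt.Println("Write error", err)
		}
	}

	// The nil pointer dereference is reported here.
	if err = pw.WriteStop(); err != nil {
		fmt.Println("WriteStop error", err)
	}
	fmt.Println("Write Finished")
	fw.Close()
}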
`

@AlexMapley

AlexMapley commented Sep 7, 2023

I'm also seeing this: an unexpected "invalid memory address or nil pointer dereference" error on WriteStop. I'm still trying to figure out why. I'm using version 1.6.2.

@robkler

robkler commented Sep 8, 2023

I have the same issue here and use the same version, 1.6.2. Does anyone know what it could be?

@imraghav20

I am facing the same issue on v1.6.2.

@K1T3K1

K1T3K1 commented Jan 28, 2024

Any updates on this? I'm facing the same issue.

@divyagowdab

Facing similar issue. Any updates on this?

@carlosjpr-collab

same thing

@shysudo

shysudo commented Mar 14, 2024

@xitongsys I'm facing a similar issue with the WriteStop function call. Any update?

@rarick

rarick commented Jun 13, 2024

I was running into this and found that my problem was that Write does not accept a slice of marshallable objects the way Read does. I did a bit of debugging and found that the error comes from the Flush call in WriteStop, but I didn't get further than that.

I think a few improvements could be made:

  1. (Far more importantly) Improve error legibility and communication by using error wrapping.
  2. Consider accepting a slice to Write.
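
To illustrate both suggestions, here is a minimal sketch that uses only the standard library. It assumes a writer with a single-row Write method; `rowWriter`, `writeAll`, `fakeWriter` and `errBadRow` are hypothetical names for this example and are not part of parquet-go.

```go
package main

import (
	"errors"
	"fmt"
)

// rowWriter is a hypothetical stand-in for a writer with a
// single-row Write method; it is not a parquet-go type.
type rowWriter interface {
	Write(row interface{}) error
}

// writeAll (hypothetical helper) writes the rows one at a time and wraps
// the first failure with the row index, so the caller sees where it failed.
func writeAll(w rowWriter, rows []interface{}) error {
	for i, row := range rows {
		if err := w.Write(row); err != nil {
			return fmt.Errorf("write row %d: %w", i, err)
		}
	}
	return nil
}

// errBadRow and fakeWriter exist only to make the sketch runnable.
var errBadRow = errors.New("bad row")

type fakeWriter struct{ written int }

func (f *fakeWriter) Write(row interface{}) error {
	if row == nil {
		return errBadRow
	}
	f.written++
	return nil
}

func main() {
	w := &fakeWriter{}
	err := writeAll(w, []interface{}{"a", nil, "c"})
	fmt.Println(err)                       // prints: write row 1: bad row
	fmt.Println(errors.Is(err, errBadRow)) // prints: true
}
```

Because the error is wrapped with `%w`, the caller can still match the underlying cause with `errors.Is` while the message says which row failed.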

@mikemherron

I'm also having this issue; in my case it is coming from common/common.go:

...
func FindFuncTable(pT *parquet.Type, cT *parquet.ConvertedType, logT *parquet.LogicalType) FuncTable {
	if cT == nil && logT == nil {
		if *pT == parquet.Type_BOOLEAN { // pT is nil here, so dereferencing it panics
			return boolFuncTable{}
...
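
For illustration, the kind of check that would turn this panic into a readable error can be sketched as follows. This is not the library's code: `physicalType`, `describeType` and `errNilType` are hypothetical stand-ins, and the only assumption is that the pointer can be nil when no physical type was resolved.

```go
package main

import (
	"errors"
	"fmt"
)

// physicalType is a hypothetical stand-in for parquet.Type.
type physicalType int

const (
	typeBoolean physicalType = iota
	typeInt32
)

// errNilType is a hypothetical error returned instead of panicking.
var errNilType = errors.New("physical type is nil")

// describeType tests the pointer before dereferencing it, so a missing
// type surfaces as an error rather than a nil pointer dereference.
func describeType(pT *physicalType) (string, error) {
	if pT == nil {
		return "", errNilType
	}
	switch *pT {
	case typeBoolean:
		return "BOOLEAN", nil
	case typeInt32:
		return "INT32", nil
	}
	return "", fmt.Errorf("unsupported physical type %d", *pT)
}

func main() {
	pt := typeInt32
	name, err := describeType(&pt)
	fmt.Println(name, err)

	_, err = describeType(nil)
	fmt.Println(err)
}
```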

This is my repro:

package main

import (
	"bytes"
	"github.com/xitongsys/parquet-go/parquet"
	"github.com/xitongsys/parquet-go/writer"
)

type person struct {
	Age  int    `json:"age"`
	Name string `json:"name"`
}

func main() {

	people := []*person{
		{
			Age:  10,
			Name: "bob",
		},
		{
			Age:  12,
			Name: "fred",
		},
	}

	buf := &bytes.Buffer{}
	pw, err := writer.NewParquetWriterFromWriter(buf, people[0], 1)
	if err != nil {
		panic(err)
	}

	pw.RowGroupSize = 128 * 1024 * 1024 //128M
	pw.CompressionType = parquet.CompressionCodec_SNAPPY
	for _, item := range people {
		if err = pw.Write(item); err != nil {
			panic(err)
		}
	}

	// panic occurs in WriteStop
	if err = pw.WriteStop(); err != nil {
		panic(err)
	}
}

@telnoratti

I ran into the same issue as @mikemherron and was able to resolve it by adding parquet tags to my struct. The repro works if you change your struct to the following.

type person struct {
    Age  int    `json:"age" parquet:"type=INT32"`
    Name string `json:"name" parquet:"type=BYTE_ARRAY"`
}

I'm guessing this isn't the same problem as the initial issue reported.

The documentation and examples use the parquet tags, so I think this is a matter of improving the error messages. I had been modifying existing code and accidentally removed the tags while refactoring. It's an easy mistake to make, and I don't think the failure should only show up when syncing the parquet file.
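
Until the library reports this itself, a caller-side check can catch missing tags before the writer is created. The following is a minimal sketch using only `reflect`; `missingParquetTags`, `untagged` and `tagged` are hypothetical names for this example, and the check only looks for the presence of a `parquet` tag, not for whether its contents are valid.

```go
package main

import (
	"fmt"
	"reflect"
)

// missingParquetTags (hypothetical helper) returns the names of the
// exported struct fields of v that carry no `parquet` struct tag.
func missingParquetTags(v interface{}) []string {
	t := reflect.TypeOf(v)
	for t != nil && t.Kind() == reflect.Ptr {
		t = t.Elem() // unwrap pointers such as *person
	}
	if t == nil || t.Kind() != reflect.Struct {
		return nil
	}
	var missing []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath != "" {
			continue // skip unexported fields
		}
		if _, ok := f.Tag.Lookup("parquet"); !ok {
			missing = append(missing, f.Name)
		}
	}
	return missing
}

// untagged mirrors the struct from the repro above.
type untagged struct {
	Age  int    `json:"age"`
	Name string `json:"name"`
}

// tagged mirrors the struct with parquet tags added.
type tagged struct {
	Age  int    `json:"age" parquet:"type=INT32"`
	Name string `json:"name" parquet:"type=BYTE_ARRAY"`
}

func main() {
	fmt.Println(missingParquetTags(&untagged{})) // prints: [Age Name]
	fmt.Println(missingParquetTags(&tagged{}))   // prints: []
}
```

Calling such a helper before constructing the writer, and returning an error when the result is non-empty, would report the problem at setup time instead of in WriteStop.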

@xitongsys Maybe when returning from NewSchemaHandlerFromStruct you could check whether len(infos) == 0 and return an error? I can't imagine why that would be a valid state.

res := NewSchemaHandlerFromSchemaList(schemaElements)
res.Infos = infos
res.CreateInExMap()
return res, nil
}
