added []api.Fields Generation From Go Struct #183

Open: 1 commit to merge into base: master

DawnKosmos

This PR introduces a new feature for generating Typesense field schemas from Go structs. By analyzing struct tags, this feature automatically constructs a schema that matches the structure and metadata specified in the Go code. This streamlines the process of setting up Typesense collections by reducing the need for manually defining schemas.

Key Features

  • Automatic Schema Generation: Converts Go struct fields and their metadata into a Typesense field schema.
  • Support for Various Tags: Recognizes tags for indexing, sorting, faceting, optional fields, and joins. This allows for customizable schema generation based on the struct's tags.
  • Embedded Structs Support: Handles embedded structs correctly, integrating their fields into the parent schema.
  • Supported Types: Covers a wide range of types, including basic types (string, int, bool, etc.), arrays, pointers and nested objects.

Usage

type MyStruct struct {
	Id     string `json:"id,index"`
	Name   string `json:"name"`
	Age    int    `json:"age,facet"`
	Email  string `json:"email,optional"`
	UserId string `json:"user_id,index,join:user.id"`
}

fields, err := ToFields(MyStruct{})
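
The PR's actual implementation is not shown in this conversation. As a rough sketch of how options in a json tag could be turned into a field list with reflection: the Field struct, the toFields and typeName names, and the Go-to-Typesense type mapping below are all assumptions made for illustration, not the PR's code or the client's api.Field type. Embedded structs are left out to keep it short.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Field is a simplified, hypothetical stand-in for the client's field type.
type Field struct {
	Name      string
	Type      string
	Index     bool
	Sort      bool
	Facet     bool
	Optional  bool
	Reference string // join target such as "user.id"
}

type MyStruct struct {
	Id     string `json:"id,index"`
	Name   string `json:"name"`
	Age    int    `json:"age,facet"`
	Email  string `json:"email,optional"`
	UserId string `json:"user_id,index,join:user.id"`
}

// typeName maps a Go type to a Typesense type name (assumed mapping).
func typeName(t reflect.Type) (string, error) {
	switch t.Kind() {
	case reflect.Pointer:
		return typeName(t.Elem())
	case reflect.String:
		return "string", nil
	case reflect.Bool:
		return "bool", nil
	case reflect.Int8, reflect.Int16, reflect.Int32:
		return "int32", nil
	case reflect.Int, reflect.Int64:
		return "int64", nil
	case reflect.Float32, reflect.Float64:
		return "float", nil
	case reflect.Slice, reflect.Array:
		elem, err := typeName(t.Elem())
		if err != nil {
			return "", err
		}
		return elem + "[]", nil
	case reflect.Struct:
		return "object", nil
	default:
		return "", fmt.Errorf("unsupported type %s", t)
	}
}

// toFields builds one Field per exported struct field from its json tag.
func toFields(v any) ([]Field, error) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a struct value, got %T", v)
	}
	var fields []Field
	for i := 0; i < t.NumField(); i++ {
		sf := t.Field(i)
		parts := strings.Split(sf.Tag.Get("json"), ",")
		if !sf.IsExported() || parts[0] == "-" {
			continue
		}
		typ, err := typeName(sf.Type)
		if err != nil {
			return nil, fmt.Errorf("field %s: %w", sf.Name, err)
		}
		f := Field{Name: parts[0], Type: typ}
		if f.Name == "" {
			f.Name = sf.Name // no name in the tag: fall back to the Go field name
		}
		for _, opt := range parts[1:] {
			switch {
			case opt == "index":
				f.Index = true
			case opt == "sort":
				f.Sort = true
			case opt == "facet":
				f.Facet = true
			case opt == "optional":
				f.Optional = true
			case strings.HasPrefix(opt, "join:"):
				f.Reference = strings.TrimPrefix(opt, "join:")
			}
		}
		fields = append(fields, f)
	}
	return fields, nil
}

func main() {
	fields, err := toFields(MyStruct{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range fields {
		fmt.Printf("%+v\n", f)
	}
}
```

Because a bare flag can only switch a property on, this shape has no way to express "index: false".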

PR Checklist

@forcemeter

nice job!

This implementation is better and non-intrusive.

@kishorenc
Member

Thank you, we will be reviewing the PR over the next week!

@phiHero
Contributor

phiHero commented Sep 13, 2024

Thanks @DawnKosmos, the PR looks good.

It seems like we cannot set a boolean property to false. Some properties are enabled by Typesense by default (e.g. index).

How about we use a separate tag for each property instead of putting them all in one json tag?

type MyStruct struct {
	Name   string `json:"id" index:"true" facet:"true"`
	UserId string `json:"user_id" join:"user.id"`
}

This would also get rid of the go-staticcheck error: unknown JSON option "index" (SA5008).
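
A minimal sketch of how separate tags could distinguish "not set" from an explicit false, using reflect.StructTag.Lookup and a *bool result. The boolTag and boolOptions helpers and the variant of MyStruct below are hypothetical and only illustrate the idea; they are not part of the PR.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// MyStruct is a variant of the example above with index explicitly disabled.
type MyStruct struct {
	Name   string `json:"name" index:"false" facet:"true"`
	UserId string `json:"user_id" join:"user.id"`
}

// boolTag returns nil when the tag key is absent, so the server default
// can apply, and a pointer to the parsed value when it is present.
func boolTag(tag reflect.StructTag, key string) (*bool, error) {
	raw, ok := tag.Lookup(key)
	if !ok {
		return nil, nil
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return nil, fmt.Errorf("tag %s:%q: %w", key, raw, err)
	}
	return &v, nil
}

// boolOptions collects one boolean tag for every field, keyed by Go field name.
func boolOptions(v any, key string) (map[string]*bool, error) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a struct value, got %T", v)
	}
	out := make(map[string]*bool, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		sf := t.Field(i)
		b, err := boolTag(sf.Tag, key)
		if err != nil {
			return nil, fmt.Errorf("field %s: %w", sf.Name, err)
		}
		out[sf.Name] = b
	}
	return out, nil
}

// describe renders a tri-state setting for printing.
func describe(b *bool) string {
	if b == nil {
		return "unset (server default)"
	}
	return strconv.FormatBool(*b)
}

func main() {
	for _, key := range []string{"index", "facet"} {
		opts, err := boolOptions(MyStruct{}, key)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		for name, b := range opts {
			fmt.Printf("%s %s: %s\n", name, key, describe(b))
		}
	}
}
```

Returning a pointer keeps three states apart (true, false, absent), which a plain bool cannot do.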

@DawnKosmos
Author

Yes, that's doable. When I have time this week, I'll change it like this.

@phiHero
Contributor

phiHero commented Sep 17, 2024

@DawnKosmos Also, could you remove that .DS_Store file and add it to .gitignore? You can run this command if it's still being tracked by git.

git rm --cached .DS_Store

@MarcMeszaros
Contributor

MarcMeszaros commented Oct 16, 2024

@DawnKosmos coincidentally we had implemented something similar internally at our company. I could contribute some pieces from our implementation if you are willing to accept them.
Below are some other ideas to consider based on our internal implementation:

Some of our requirements are/were:

  1. Try to reuse existing structs without having to convert to a different struct (an explicit search tag instead of assuming the json tag is safe to use)
  2. Usage of complex structs doesn't require explicitly defining the Typesense equivalent data primitive in struct tags (a problem we run into with other libraries like bun)

1. Reuse existing structs

This is potentially the controversial one. If you don't want to repeat yourself and maintain various funcs to convert/marshal from one representation to another, then this should be self-explanatory.
If your perspective is that structs shouldn't mix different storage concerns and each storage representation should have its own struct, then I guess you can ignore the rest of this comment.

To avoid conflicts with other libraries (ORMs, validation, openapi, etc.), we decided to use a dedicated search struct tag instead of assuming that the json tag has the same name and should control the search representation.
In our case we have several tags on our structs for various Go libraries.

Example struct:

type Project struct {
	ID         uint64          `typesense:"id" json:"id"`
	CreatedAt  time.Time       `typesense:"created_at,sort" json:"created_at"`
	UserID     uint64          `typesense:"user_id,omitempty" json:"user_id,omitempty"`
	Public     bool            `typesense:"public,facet" json:"-"` // don't return in json, but include in search index
	Labels     identifier.Tags `typesense:"labels,type=[]string" json:"labels"` // explicitly set search type
	Screenshot *storage.File   `typesense:"screenshot,omitempty" json:"screenshot,omitempty"` // implicit typesense type via interface
}
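
To illustrate the tag shape used in the struct above (a name, bare flags, and an optional type=... override), here is a small parsing sketch. The tagInfo type and parseTag function are hypothetical names, not taken from the internal implementation described in this comment; in a schema generator the raw string would come from something like sf.Tag.Get("typesense").

```go
package main

import (
	"fmt"
	"strings"
)

// tagInfo is a hypothetical parsed form of a `typesense:"..."` tag value.
type tagInfo struct {
	Name    string
	Type    string          // explicit override from type=..., empty when absent
	Options map[string]bool // bare flags such as sort, facet, omitempty
}

// parseTag splits "name,flag,type=..." into its parts.
func parseTag(raw string) tagInfo {
	parts := strings.Split(raw, ",")
	info := tagInfo{Name: parts[0], Options: map[string]bool{}}
	for _, p := range parts[1:] {
		if key, val, ok := strings.Cut(p, "="); ok && key == "type" {
			info.Type = val
			continue
		}
		if p != "" {
			info.Options[p] = true
		}
	}
	return info
}

func main() {
	for _, raw := range []string{
		"created_at,sort",
		"public,facet",
		"labels,type=[]string",
		"screenshot,omitempty",
	} {
		fmt.Printf("%q -> %+v\n", raw, parseTag(raw))
	}
}
```

Keeping the search options in their own tag means the json tag stays valid for encoding/json and can differ from the search representation, as with the Public field.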

2. Using an interface

We knew our codebase had more complex structs that should be indexed but didn't map directly to the basic Typesense primitives. As the struct authors, we know what the representation should be. In the example above, the storage.File type doesn't need a typesense:"screenshot,type=string" tag (although you could certainly override the type in our implementation). We use storage.File a lot in our codebase. Omitting the explicit type reduces typos from developers on the team and avoids a codebase-wide search and replace if the representation ever changes.

Interface

We created a simple interface that structs can implement and wrapped the Typesense client to check whether the interface is implemented when generating the schema. If it is, we also use its funcs to get the data we need for the collection representation in Typesense (outside the scope of this PR).

interface.go

type Marshaler interface {
	SearchFieldType() string
	MarshalSearch() (any, error) // outside scope of this PR (included for completeness)
}

type Unmarshaler interface {
	UnmarshalSearch(bytes []byte) error // outside scope of this PR (included for completeness)
}

storage.File example implementing the Marshaler interface we created

// file struct and other receiver funcs omitted

func (f File) SearchFieldType() string {
	return "string"
}

func (f File) MarshalSearch() (any, error) {
	// generate a CDN-accessible public URL that should be added to the search index
	downloadURL, err := f.GenerateDownloadURL()
	if err != nil {
		return "", err
	}

	// return the public download URL for search index marshaling
	return downloadURL, nil
}

Anywhere we use identifier.Tags or storage.File in the examples, we don't have to specify the Typesense type in the struct tag. The interface takes care of it: when we update the schema, SearchFieldType() tells us which type to use in the collection fields.
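
A minimal sketch of how a schema generator might consult such an interface before falling back to a kind-based mapping. The fieldTyper interface, the File and Tags stand-in types, and the searchType and searchTypeOf functions are assumptions for illustration; the real storage.File, identifier.Tags, and client wrapper are not shown in this thread.

```go
package main

import (
	"fmt"
	"reflect"
)

// fieldTyper is the schema-related part of the Marshaler interface above.
type fieldTyper interface {
	SearchFieldType() string
}

// File and Tags are hypothetical stand-ins for storage.File and identifier.Tags.
type File struct{ Path string }

func (f File) SearchFieldType() string { return "string" }

type Tags []string

func (t Tags) SearchFieldType() string { return "string[]" }

// searchType resolves the Typesense type name for a Go type, preferring the
// type's own SearchFieldType over the built-in mapping (assumed mapping).
func searchType(t reflect.Type) (string, error) {
	for t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	// reflect.New yields a non-nil *T, whose method set covers both
	// value-receiver and pointer-receiver implementations.
	if ft, ok := reflect.New(t).Interface().(fieldTyper); ok {
		return ft.SearchFieldType(), nil
	}
	switch t.Kind() {
	case reflect.String:
		return "string", nil
	case reflect.Bool:
		return "bool", nil
	case reflect.Int8, reflect.Int16, reflect.Int32:
		return "int32", nil
	case reflect.Int, reflect.Int64:
		return "int64", nil
	case reflect.Float32, reflect.Float64:
		return "float", nil
	default:
		return "", fmt.Errorf("no search type for %s; implement SearchFieldType", t)
	}
}

// searchTypeOf is a convenience wrapper for plain values.
func searchTypeOf(v any) (string, error) {
	t := reflect.TypeOf(v)
	if t == nil {
		return "", fmt.Errorf("cannot resolve the type of nil")
	}
	return searchType(t)
}

func main() {
	for _, v := range []any{File{}, &File{}, Tags{}, int64(1), "text"} {
		typ, err := searchTypeOf(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%T -> %s\n", v, typ)
	}
}
```

With this shape, changing how a widely used type is indexed only requires editing its SearchFieldType method rather than every struct tag that mentions it.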
