[10.x] Real-Time Model Factories #47849

Closed
wants to merge 29 commits into from

joedixon
Contributor

@joedixon joedixon commented Jul 27, 2023

Oftentimes when I’m cranking on tests I get frustrated by the need to stop what I’m doing to make a factory for my model. This is especially true when I don’t really care what the model is populated with and I just want to test a relation or prove the correct models are being returned from an endpoint.

I was talking with @jbrooksuk about this en route to Laracon, so we started to explore the idea on the flight, where we managed to prove the concept of what we're calling “Real-Time Model Factories”, which we believe improves the DX of model factories.

With the addition of this PR, when calling the factory method of a model, Laravel will first check to see whether a factory exists for the model and, if not, the factory definition will be generated automatically using a combination of the following techniques.

First, it will attempt to guess the required data using the column name. For example, given a column called email or email_address, the real-time factory will automatically populate an email address.

If the column can’t be guessed, it will attempt to see whether the model is using a cast and, if so, will attempt to use it. This works for all supported built-in data types. For example, if the column is using the collection type, the column in the factory definition will return a collection. It doesn't support all custom casts as it’s not possible to know the required data format, but it does support enums and enum collections where the value is selected at random from the defined options.

Finally, if the value is still not populated, the column type provided by DBAL is used to infer a value. At this point, if the column is a primary or foreign key or if the column is nullable, the column in the factory definition will resolve to null. In all other instances, a sensible value derived from the column type is used.

Under the hood, the values are generated using faker.
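The resolution order described above can be pictured with a minimal sketch in plain PHP. It is not the PR's implementation: `resolveColumnValue()` is a hypothetical helper, the column metadata is passed in as a plain array rather than read through DBAL, and `random_int()` stands in for faker.

```php
<?php

/**
 * Hypothetical, simplified stand-in for the three-step resolution order.
 * $column keys (name, type, primary, foreign, nullable) are assumptions
 * made for this sketch.
 */
function resolveColumnValue(array $column, array $casts): mixed
{
    $name = $column['name'];

    // 1. Try to guess the value from the column name.
    $byName = [
        'email' => fn () => 'user' . random_int(1, 999) . '@example.com',
        'email_address' => fn () => 'user' . random_int(1, 999) . '@example.com',
    ];
    if (isset($byName[$name])) {
        return $byName[$name]();
    }

    // 2. Fall back to the model's cast, if one is defined and supported.
    $byCast = [
        'integer' => fn () => random_int(0, 100),
        'boolean' => fn () => (bool) random_int(0, 1),
        'array' => fn () => [],
    ];
    if (isset($casts[$name], $byCast[$casts[$name]])) {
        return $byCast[$casts[$name]]();
    }

    // 3. Keys and nullable columns resolve to null...
    if ($column['primary'] || $column['foreign'] || $column['nullable']) {
        return null;
    }

    // ...otherwise a value is inferred from the column type.
    return match ($column['type']) {
        'integer' => random_int(0, 100),
        'boolean' => (bool) random_int(0, 1),
        default => 'text-' . random_int(1, 999),
    };
}
```

Calling it with `['name' => 'id', 'type' => 'integer', 'primary' => true, 'foreign' => false, 'nullable' => false]` and no casts returns null, while an `email` column returns an address before the cast or type steps are ever reached.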

Given the following table and model definitions:

Schema::create('users', function (Blueprint $table) {
    $table->increments('id');
    $table->string('email');
    $table->integer('age');
    $table->enum('status', ['active', 'inactive']);
    $table->text('bio')->nullable();
    $table->timestamps();
});

class User extends Model
{
    use HasRealTimeFactory;

    protected $casts = [
        'age' => 'integer',
        'status' => Status::class,
    ];
}

The properties would be assigned similar to the output below when using the real-time factory:

$user = User::factory()->create();

[
    'id' => null, // primary key so value is null
    'email' => '[email protected]', // value guessed from column name
    'age' => 37, // integer value generated from cast
    'status' => 'active', // enum value selected at random from enum cast
    'bio' => null, // nullable field set to null
    'created_at' => '1973-12-19T11:07:50.000000Z', // date time value inferred from date cast
    'updated_at' => '1992-03-09T02:40:47.000000Z', // date time value inferred from date cast
] 

Of course, it’s possible to override fields as with any other model.

$user = User::factory()->create(['email' => '[email protected]']);

[
    'id' => null,
-   'email' => '[email protected]',
+   'email' => '[email protected]',
    'age' => 37,
    'status' => 'active',
    'bio' => null,
    'created_at' => '1995-08-22T14:30:24.000000Z',
    'updated_at' => '2018-04-20T21:33:42.000000Z',
] 
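Conceptually, an override behaves like merging the explicit attributes over the generated ones. The sketch below illustrates only that idea in plain PHP, with made-up values, and is not how the factory is implemented internally.

```php
<?php

// Values a real-time factory might have generated (illustrative only).
$generated = [
    'id' => null,
    'email' => 'generated@example.com',
    'age' => 37,
];

// Attributes passed explicitly to create().
$overrides = ['email' => 'chosen@example.com'];

// Later keys win, so explicit attributes replace generated ones.
$attributes = array_merge($generated, $overrides);
```

Only `email` changes here; `id` and `age` keep their generated values.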

It’s also possible to utilize factory relationships:

$post = Post::factory()->hasComments(3)->forUser()->create();

dd($post->toArray(), $post->comments->toArray(), $post->user->toArray());

// Post
[
    'id' => 1,
    'user_id' => 1,
    'title' => 'rerum',
    'body' => 'Quis et libero non aut aut quia. Eos alias asperiores a quo totam ipsam qui. Mollitia et accusantium officiis sed occaecati qui blanditiis. Dolores id odit blanditiis sit aut.',
    'published' => true,
    'created_at' => '1987-01-24T18:34:06.000000Z',
    'updated_at' => '2010-10-14T23:48:40.000000Z',
]

// User
[
    'id' => 1,
    'name' => 'Alexanne Braun',
    'email' => '[email protected]',
    'created_at' => '1992-07-15T08:36:29.000000Z',
    'updated_at' => '1988-06-18T11:01:00.000000Z',
]

// Comments
[
    [
        'id' => 1,
        'commentable_type' => 'Illuminate\Tests\Database\Post',
        'commentable_id' => 1,
        'body' => 'laborum',
        'created_at' => '1990-11-02T01:16:16.000000Z',
        'updated_at' => '1993-07-23T18:58:20.000000Z',
    ],
    [
        'id' => 2,
        'commentable_type' => 'Illuminate\Tests\Database\Post',
        'commentable_id' => 1,
        'body' => 'minima',
        'created_at' => '2020-03-21T05:43:01.000000Z',
        'updated_at' => '2014-12-19T03:28:40.000000Z',
    ],
    [
        'id' => 3,
        'commentable_type' => 'Illuminate\Tests\Database\Post',
        'commentable_id' => 1,
        'body' => 'repellendus',
        'created_at' => '1974-05-02T14:50:21.000000Z',
        'updated_at' => '2012-08-24T11:36:09.000000Z',
    ],
]

In the above example, none of the models used a physical factory - they were all generated in real-time, so you can see it’s very easy to build up models without needing to define values until it’s truly necessary.

Of course, when that time comes, simply define a factory for the model and you are back in control of how the model should be generated.


A note on enum columns

It’s not possible to obtain the allowed values of an enum column with the doctrine/dbal package. When using an enum cast or when the column is nullable, this is not an issue. However, in all other cases, a random string is used which will error as an invalid value when the query is executed.
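An enum cast avoids this because the allowed values are available from the enum class itself. As a minimal sketch, assuming PHP 8.1 and a hypothetical `Status` enum matching the `status` column above, a random case could be picked like this (`randomEnumCase()` is an illustrative helper, not part of the PR):

```php
<?php

// Hypothetical backed enum mirroring the allowed values of the `status` column.
enum Status: string
{
    case Active = 'active';
    case Inactive = 'inactive';
}

/**
 * Pick a random case from any enum class.
 * Assumes the enum defines at least one case.
 *
 * @param class-string<\UnitEnum> $enum
 */
function randomEnumCase(string $enum): \UnitEnum
{
    $cases = $enum::cases();

    return $cases[array_rand($cases)];
}

$status = randomEnumCase(Status::class);
```

Because the value always comes from `Status::cases()`, it is valid for the column by construction, unlike a random string.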

*/
protected static function newFactory()
{
return (new RealTimeFactory)
Contributor

@cosmastech cosmastech Jul 27, 2023

Could you build RealTimeFactory from the container? I can already see wanting to add options to the guessable values. Another option is to create a static method to allow us to set those in user-land, but I think that's much less robust.

Member

At this point, consider using an actual factory, or passing the individual column values in the create method.

Contributor

It is easy enough to just create a new application-level trait for HasRealTimeFactory and also extend the RealTimeFactory class to add to the list.

I would at least suggest providing the ability to add to this list in userland, lest GitHub be inundated with weekly PRs to add "family_name", "telephone_number", etc.

Member

We'll leave this for Taylor to look at.

@cosmastech
Contributor

🔥 Love this

@joedixon joedixon marked this pull request as ready for review July 27, 2023 15:28
@utsavsomaiya
Contributor

What a thinking brother...😘

@jasonmccreary
Contributor

jasonmccreary commented Jul 27, 2023

This is super cool. I'm going to pitch something as your fellow bourbon drinker…

I feel the DX for this would be even awesomer if you didn't need to specify the trait. Having to jump into the model and add a trait doesn't feel "real-time" to me. At least not in the same sense as the "real-time" facades (only adding a namespace prefix). With the trait, this feels more like a "guessable factory".

I don't know the implications of this, but I wonder if the default for all models is a factory method which returns RealTimeFactory. That would avoid explicitly adding a trait and provide the developer immediate ability to use factories within their apps/tests. 🔥

@JayBizzle
Contributor

> This is super cool. I'm going to pitch something as your fellow bourbon drinker…
>
> I feel the DX for this would be even awesomer if you didn't need to specify the trait. Having to jump into the model and add a trait doesn't feel "real-time" to me. At least not in the same sense as the "real-time" facades (only adding a namespace prefix). With the trait, this feels more like a "guessable factory".
>
> I don't know the implications of this, but I wonder if the default for all models is a factory method which returns RealTimeFactory. That would avoid explicitly adding a trait and provide the developer immediate ability to use factories within their apps/tests. 🔥

Totally agree with this. Some great work in this PR but it just needs that extra Laravel magic 👍🏻

@Rizky92

Rizky92 commented Jul 28, 2023

I support this, but can we have it in HasFactory trait instead? So instead of having the option to use 2 traits, we can use current HasFactory. If we need more control of it, we only need to define it in model's factory class without losing the "guessing" feature.

@heychazza

Yes yes yes! I really like this, defo agree with the above that it would be nice if it automatically did it, aside from that well done folks

@jbrooksuk jbrooksuk force-pushed the feat/real-time-model-factories branch from a404db8 to 0808926 Compare July 28, 2023 09:51
@jbrooksuk
Member

jbrooksuk commented Jul 28, 2023

@joedixon and I have addressed the "real-time" aspect of this feature. You can now simply use HasFactory, and if the factory class doesn't exist, Laravel will switch to a real-time factory.

Of course, if you need any customization or additional states, you should generate a factory (e.g. php artisan make:factory PostFactory) and it will be switched automatically.

@taylorotwell if you'd prefer to keep the feature opt-in, you can revert the last commit.
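The fallback described in this comment can be pictured with a small plain-PHP sketch. The classes and the `resolveFactory()` helper below are hypothetical stand-ins, and the `Model . 'Factory'` naming is a simplification of how factory class names are actually resolved.

```php
<?php

// Hypothetical stand-ins; neither class is the framework's implementation.
class RealTimeFactory
{
    public function __construct(public string $model) {}
}

class PostFactory
{
    public function __construct(public string $model) {}
}

/**
 * Return the model's dedicated factory when that class exists,
 * otherwise fall back to a real-time factory.
 */
function resolveFactory(string $model): object
{
    $factoryClass = $model . 'Factory';

    return class_exists($factoryClass)
        ? new $factoryClass($model)
        : new RealTimeFactory($model);
}
```

Here `resolveFactory('Post')` yields a `PostFactory`, while `resolveFactory('Comment')` falls back to `RealTimeFactory` because no `CommentFactory` class is defined. Adding that class later switches the result without touching the caller.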

@joedixon
Contributor Author

@taylorotwell, if this were to be included in the framework, there is the potential to remove the UserFactory from the skeleton as part of your slimming exercise for v11.

We would have to add password as a guessableValue and would lose the unverified state, but perhaps some food for thought.

@jasonmccreary
Contributor

@joedixon, nice work. Not sure how deep down the rabbit hole you want to go, but I've had a "guesser" built into Blueprint for a while. Might want to pull a few of the name/type mappings.

@utsavsomaiya
Contributor

image

Without this package it cannot work, right?

So would it be used internally, or would we need to install it by default in the skeleton?

@jbrooksuk
Member

> image
>
> Without this package it cannot work, right?
>
> So would it be used internally, or would we need to install it by default in the skeleton?

You’d need to install dbal as a dev dependency.

@hafezdivandari
Contributor

hafezdivandari commented Jul 30, 2023

Isn't there any better solution other than using doctrine/dbal? Like PR #45598

@jbrooksuk
Member

> Isn't there any better solution other than using doctrine/dbal? Like PR #45598

At the moment, this is the only way to reliably fetch this information.

@joshbonnick
Contributor

Couple of suggestions for this feature:

Add a check if the method exists in faker, if so just call it. This removes having to map an email column explicitly to the email faker method.

There is the side effect that it now uses email instead of safeEmail, but another check can be added for that specific scenario.

Guessing this way also allows the factory to use external providers added to one's project, e.g. productName()

/**
 * Guess the value of a column based on its name.
 */
protected function guessValue(string $column): mixed
{
    try {
        return fake()->{str($column)->camel()}; // convert to camel case for columns like phone_number
    } catch (InvalidArgumentException $e) {
        // faker method doesn't exist
    }

    $guessable = $this->guessableValues();
    return isset($guessable[$column]) ? $guessable[$column]() : null;
}

Use a match statement to return the faker method result. This will reduce duplication of callback functions and remove the need to check for a key in the guessable array.

Resulting factory would look something like this:

/**
 * Guess the value of a column based on its name.
 */
protected function guessValue(string $column): mixed
{
    try {
        return fake()->{str($column)->camel()}; // convert to camel case for columns like phone_number
    } catch (InvalidArgumentException $e) {
    }

    return $this->matchColumnToFakerMethod($column);
}

/**
 * Match column names with faker method
 */
protected function matchColumnToFakerMethod(string $column): mixed
{
    return match ($column) {
        'email', 'e_mail', 'email_address' => fake()->safeEmail(),
        'login', 'username' => fake()->userName(),
        // ...rest of columns
        default => null,
    };
}
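To try the pattern above without Laravel or faker installed, here is a self-contained sketch. `StubGenerator` is a hypothetical stand-in that throws for unknown formatters (the behaviour the try/catch relies on), and `camel()` is a plain-PHP replacement for `str($column)->camel()`.

```php
<?php

// Hypothetical stand-in for a faker generator: unknown formatters throw.
class StubGenerator
{
    public function __get(string $name): mixed
    {
        return match ($name) {
            'phoneNumber' => '555-0100',
            'email' => 'jane@example.com',
            default => throw new InvalidArgumentException("Unknown format \"{$name}\""),
        };
    }
}

// Convert snake_case column names to camelCase, e.g. phone_number => phoneNumber.
function camel(string $column): string
{
    return lcfirst(str_replace('_', '', ucwords($column, '_')));
}

function guessValue(StubGenerator $faker, string $column): mixed
{
    try {
        // First try a formatter named after the column itself.
        return $faker->{camel($column)};
    } catch (InvalidArgumentException) {
        // No formatter with that name; fall through to the explicit mapping.
    }

    return match ($column) {
        'email_address', 'e_mail' => $faker->email,
        default => null,
    };
}
```

With this stub, `phone_number` resolves through the camel-cased formatter, `email_address` falls through to the explicit mapping, and an unrecognised column such as `nickname` yields null.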

@taylorotwell taylorotwell marked this pull request as draft August 21, 2023 21:18
Comment on lines +133 to +135
$columns = $this->schema->listTableColumns($this->table);

return collect($columns)->keyBy(fn ($column) => $column->getName());

It seems possible to do without the $columns variable like in isForeignKey() or isPrimaryKey() in this class below:

return collect($this->schema->listTableColumns($this->table))
       ->keyBy(fn ($column) => $column->getName());

@hafezdivandari
Contributor

You may use new native Schema methods on this instead of doctrine/dbal:

Hopefully going to remove doctrine/dbal on #48864

@hafezdivandari
Contributor

hafezdivandari commented Jan 20, 2024

@jbrooksuk @joedixon I can send a PR on top of this one to use schema methods instead of doctrine dbal if you want, but it would be much easier if you target master instead of 10.x here.

@joedixon joedixon closed this Mar 12, 2024
@joedixon joedixon deleted the feat/real-time-model-factories branch March 12, 2024 14:46
@damms005

@joedixon I think it will be helpful if you can kindly give a hint on why you closed this. I mean it seems like a very useful feature and already has very good contributions from @jasonmccreary and @jbrooksuk, among others.

Thank you 🙏

@joedixon
Contributor Author

@damms005 cool idea for sure, but not something for the framework at this time. We are exploring ideas to package this up.

@damms005

Nice! Thanks for all you do 🙏
