Legacy Lobotomy — Improving Validation in the Profile Management API

The photo was generated by the author using ChatGPT-4.

This is the 9th tutorial in the series about refactoring a legacy Django project, where we will discuss improving validation in the Profile Management API. This continues the refactoring started in the previous tutorial, where we fixed significant vulnerabilities in the profile management functionality.

To get more ideas regarding the project used for this tutorial or check other tutorials in this series, check the introductory article published earlier.

Legacy Lobotomy — Confident Refactoring of a Django Project

Origin branch and destination branch

If you want to follow the steps described in this tutorial, you should start from the profile-management-api-fixing-vulnerabilities branch.If you want to only see final code state after all changes from this tutorial are applied, you can check the profile-management-api-improving-validation branch.

Fixing problems with validation

Before we dive into fixing problems with validation, we need to clearly understand how validation should work and what rules we want our API to follow. I had no ideas about this topic, so I checked the code of the mobile app which was the only consumer of this API. Here is the code responsible for validation:

const checkTextInput = () => {
if (!profileInformation.first_name && profileInformation.first_name.trim().length === 0) {
Alert.alert(‘Error’, ‘Please enter First Name’);
}
else if (!profileInformation.last_name && profileInformation.last_name.trim().length === 0) {
Alert.alert(‘Error’, ‘Please enter last name’)
}
else if (!profileInformation.gender) {
Alert.alert(‘Error’, ‘Please select your gender’)
}
else if (!profileInformation.guardian_email) {
Alert.alert(‘Error’, ‘Please enter your guardian email’)
}
else if (!profileInformation.age) {
Alert.alert(‘Error’, ‘Please enter your age’)
}
else if (!validator.isEmail(profileInformation.guardian_email)) {
Alert.alert(‘Error’, ‘Please enter valid email’)
}
else if (!profileInformation.activity) {
Alert.alert(‘Error’, ‘Please choose your activity’)
}
// proceed when entered data is valid
}

Based on this code, the validation is the following:

first_name field is required and cannot be blanklast_name field is required and cannot be blankgender field is requiredguardian_email field is required and should be a correct email addressage field is requiredactivity field is required

The backend currently doesn’t have these validation rules in place, however, it’s important that the gender and activity fields only accept values from a predefined list, while the age field should only allow integer values between 13 and 99, inclusive. Also, the first_name and last_name can consist of at most 150 characters (such rules are defined in the AbstractUser class provided by Django and which is inherited by our User class). It’s necessary to make sure these validation rules are functioning correctly on the backend. To ensure this, we need to implement 3 different tests:

A one which checks how the system behaves when request doesn’t contain a required fieldA one which checks how the system behaves when one of the fields in the request has an invalid valueA one which checks how the system behaves when a valid request sent

We could combine tests #1 and #2 into a single test with different parameters, but in this case we would need to add conditional logic into the test to remove a specific field. It would make a test overcomplicated and harder to read. So I prefer to split into 2 tests with simpler parametrization.

Here is the first test which sends a request without one required field:

class TestProfileManagement:
# ———- Other methods are omitted for simplicity ———— #
@pytest.mark.django_db
@pytest.mark.parametrize(‘field_name’, [
‘first_name’, ‘last_name’, ‘gender’, ‘guardian_email’, ‘age’, ‘activity’
])
def test_put_method_when_required_field_is_not_passed_then_400_bad_request_status_should_be_returned(
self, field_name, user, auth_client
):
payload = self.update_user_payload(user)
payload.pop(field_name)

response = auth_client.put(self.user_details_url(user), payload)

assert response.status_code == status.HTTP_400_BAD_REQUEST
assert response.json() == {field_name: [‘This field is required.’]}

Another test which sends invalid values for one of the fields at each run:

class TestProfileManagement:
# ———- Other methods are omitted for simplicity ———— #
@pytest.mark.django_db
@pytest.mark.parametrize(‘field_name, value, expected_error’, [
pytest.param(‘first_name’, None, ‘This field may not be null.’, id=’first_name is null’),
pytest.param(‘first_name’, ”, ‘This field may not be blank.’, id=’first_name is empty’),
pytest.param(‘first_name’, ‘ ‘, ‘This field may not be blank.’, id=’first_name is blank’),
pytest.param(
‘first_name’,
‘a’ * 151, ‘Ensure this field has no more than 150 characters.’,
id=’first_name is too long’
),
pytest.param(‘last_name’, None, ‘This field may not be null.’, id=’last_name is null’),
pytest.param(‘last_name’, ”, ‘This field may not be blank.’, id=’last_name is empty’),
pytest.param(‘last_name’, ‘ ‘, ‘This field may not be blank.’, id=’last_name is blank’),
pytest.param(
‘last_name’,
‘a’ * 151, ‘Ensure this field has no more than 150 characters.’,
id=’last_name is too long’
),
pytest.param(‘gender’, None, ‘This field may not be null.’, id=’gender is null’),
pytest.param(‘gender’, ”, ‘”” is not a valid choice.’, id=’gender is empty’),
pytest.param(‘gender’, ‘ ‘, ‘” ” is not a valid choice.’, id=’gender is blank’),
pytest.param(‘gender’, ‘invalid’, ‘”invalid” is not a valid choice.’, id=’gender value is not allowed’),
pytest.param(‘guardian_email’, None, ‘This field may not be null.’, id=’guardian_email is null’),
pytest.param(‘guardian_email’, ”, ‘This field may not be blank.’, id=’guardian_email is empty’),
pytest.param(‘guardian_email’, ‘ ‘, ‘This field may not be blank.’, id=’guardian_email is blank’),
pytest.param(
‘guardian_email’, ‘invalid-email’, ‘Enter a valid email address.’, id=’guardian_email is invalid 1′
),
pytest.param(
‘guardian_email’, ‘invalid-email@’, ‘Enter a valid email address.’, id=’guardian_email is invalid 2′
),
pytest.param(
‘guardian_email’, ‘invalid-email@fake’, ‘Enter a valid email address.’, id=’guardian_email is invalid 3′
),
pytest.param(
‘guardian_email’,
’email’ + ‘a’ * (255 – len(’email@fake.com’)) + ‘@fake.com’,
‘Ensure this field has no more than 254 characters.’,
id=’guardian_email is too long’,
),
pytest.param(‘age’, None, ‘This field may not be null.’, id=’age is null’),
pytest.param(‘age’, ”, ‘A valid integer is required.’, id=’age is empty’),
pytest.param(‘age’, ‘ ‘, ‘A valid integer is required.’, id=’age is blank’),
pytest.param(‘age’, 12, ‘Ensure this value is greater than or equal to 13.’, id=’age is below minimum allowed’),
pytest.param(‘age’, 100, ‘Ensure this value is less than or equal to 99.’, id=’age exceeds maximum allowed’),
pytest.param(‘age’, 45.28, ‘A valid integer is required.’, id=’age is not an integer’),
pytest.param(‘activity’, None, ‘This field may not be null.’, id=’activity is null’),
pytest.param(‘activity’, ”, ‘”” is not a valid choice.’, id=’activity is empty’),
pytest.param(‘activity’, ‘ ‘, ‘” ” is not a valid choice.’, id=’activity is blank’),
pytest.param(‘activity’, ‘invalid’, ‘”invalid” is not a valid choice.’, id=’activity value is not allowed’),
])
def test_put_method_when_invalid_value_passed_then_400_bad_request_status_should_be_returned(
self, field_name, value, expected_error, user, auth_client
):
payload = {**self.update_user_payload(user), field_name: value}

response = auth_client.put(self.user_details_url(user), payload)

assert response.status_code == status.HTTP_400_BAD_REQUEST
assert response.json() == {field_name: [expected_error]}

And the last one which updates one field at the time and checks that this field was really changed and response contains a new value as well:

class TestProfileManagement:
# ———- Other methods are omitted for simplicity ———— #
@pytest.mark.django_db
@pytest.mark.parametrize(‘field_name, current_value, new_value’, [
pytest.param(‘first_name’, ‘John’, ‘Jack’),
pytest.param(‘last_name’, ‘Dou’, ‘Carter’),
pytest.param(‘gender’, ‘Male’, ‘Female’),
pytest.param(‘guardian_email’, ‘guardian-fake-email-old@fake.com’, ‘guardian-fake-email-new@fake.com’),
pytest.param(‘age’, 99, 13), # edge case
pytest.param(‘age’, 25, 45), # any normal value in the allowed range
pytest.param(‘age’, 13, 99), # edge case
pytest.param(‘activity’, None, ‘Law Explorers’),
])
def test_put_method_when_valid_data_passed_then_user_data_should_be_updated(
self, field_name, current_value, new_value, user_factory, auth_client_factory
):
user = user_factory(**{field_name: current_value})
payload = {**self.update_user_payload(user), field_name: new_value}

auth_client = auth_client_factory(user)
response = auth_client.put(self.user_details_url(user), payload)

user.refresh_from_db()

assert response.status_code == status.HTTP_200_OK

# verify that both response and user entry have a new value of the field
assert response.json()[field_name] == new_value
assert getattr(user, field_name) == new_value

If we run the tests, we will see that 18 of them fail because all these tests expect that the server returns 400 Bad Request HTTP error, but it returns 200 OK HTTP code which means that the request was successfully processed.

This happens because all the fields necessary for our PUT method are marked as optional in the User model, and leaving them blank is permitted. Since no extra rules have been added in the UserSerializer class, the ones defined at the model level will be applied. We can verify that by inspecting the UserSerializer this way:

Step 1. Run the following command from the root of the project to open Django shell:

python src/manage.py shell

Step 2. In the opened shell run the following commands:

from users.serializers import UserSerializerserializer = UserSerializer()print(repr(serializer))

Step 3. Check the result which should look like below:

UserSerializer():
pk = IntegerField(label=’ID’, read_only=True)
email = EmailField(read_only=True)
first_name = CharField(allow_blank=True, max_length=150, required=False)
last_name = CharField(allow_blank=True, max_length=150, required=False)
age = IntegerField(allow_null=True, max_value=99, min_value=13, required=False)
gender = ChoiceField(allow_blank=True, allow_null=True, choices=((‘Male’, ‘Male’), (‘Female’, ‘Female’), (‘Non-Binary’, ‘Non-Binary’), (‘Transgender’, ‘Transgender’), (‘Other’, ‘Other’)), required=False)
guardian_email = EmailField(allow_blank=True, allow_null=True, max_length=254, required=False)
accepted_terms_cond = BooleanField(read_only=True)
activity = ChoiceField(allow_blank=True, allow_null=True, choices=((‘Law Explorers’, ‘Law Explorers’),), required=False)
total_points = IntegerField(read_only=True)
first_login = BooleanField(read_only=True)
is_superuser = BooleanField(help_text=’Designates that this user has all permissions without explicitly assigning them.’, label=’Superuser status’, read_only=True)

As we can see, the first_name, last_name, age, gender, guardian_email and activity fields declared as optional. It’s permitted to leave the first_name, last_name, gender, guardian_email and activity fields blank. It’s also allowed to pass null value for the gender, guardian_email and activity fields.

Thus, we need to override the validation rules for these fields. We can declare all needed fields as serializer properties and specify all needed attributes for them. In this case we will need to specify all other attributes of each field including choices, length and so on. In order to override some specific attributes we are interested in, we can use the extra_kwargs property of the Meta class. This property is a dictionary with keys equal to field names and values are also dictionaries with keys equal to attribute names we want to override.

class UserSerializer(serializers.ModelSerializer):
class Meta:
# ———- Other methods and properties are omitted for simplicity ———— #
extra_kwargs = {
‘first_name’: {‘required’: True, ‘allow_blank’: False},
‘last_name’: {‘required’: True, ‘allow_blank’: False},
‘gender’: {‘required’: True, ‘allow_blank’: False, ‘allow_null’: False},
‘guardian_email’: {‘required’: True, ‘allow_blank’: False, ‘allow_null’: False},
‘age’: {‘required’: True, ‘allow_null’: False},
‘activity’: {‘required’: True, ‘allow_blank’: False, ‘allow_null’: False},
}

Here is a breakdown of all overridden attributes for 6 properties:

required specifies if it’s required for a field to be included into the request or not. If the value of this attribute is False, a field can be not included into request.allow_blank applicable for fields of string type. If its value is False, empty strings are not allowed. If the trim_whitespace is also set to True (default value), all strings consisted of only whitespaces will also not be allowed.allow_null allows to specify if a request can contain null value for the field.

If we inspect the serializer one more time (we need to close the shell, open it again and repeat steps above), we will see that now all fields have desired values of the attributes:

UserSerializer():
pk = IntegerField(label=’ID’, read_only=True)
email = EmailField(read_only=True)
first_name = CharField(allow_blank=False, max_length=150, required=True)
last_name = CharField(allow_blank=False, max_length=150, required=True)
age = IntegerField(allow_null=False, max_value=99, min_value=13, required=True)
gender = ChoiceField(allow_blank=False, allow_null=False, choices=((‘Male’, ‘Male’), (‘Female’, ‘Female’), (‘Non-Binary’, ‘Non-Binary’), (‘Transgender’, ‘Transgender’), (‘Other’, ‘Other’)), required=True)
guardian_email = EmailField(allow_blank=False, allow_null=False, max_length=254, required=True)
accepted_terms_cond = BooleanField(read_only=True)
activity = ChoiceField(allow_blank=False, allow_null=False, choices=((‘Law Explorers’, ‘Law Explorers’),), required=True)
total_points = IntegerField(read_only=True)
first_login = BooleanField(read_only=True)
is_superuser = BooleanField(help_text=’Designates that this user has all permissions without explicitly assigning them.’, label=’Superuser status’, read_only=True)

If we run tests, only these 6 should fail:

We can see that the reason is the same — we expect the server to return 400 Bad Request HTTP error, but it returns 200 OK status code when a required field is not present in the request. But we specified in the serializer that these fields are all required. What’s the problem? The evil is in details. If to be more precise, in the partial=True attribute used for instantiating the UserSerializer class in the update method of the UserViewSet class. Let’s remove this attribute at all and run tests again. Now all the tests should pass. Creating a new instance of the serializer should look like below (since we don’t need to support PATCH method, we can make our serializer work in the full update mode):

serializer = UserSerializer(self.get_object(), data=request.data)

Getting rid of the PATCH method

Now let’s make sure PATCH requests are not handled by our API. To do this, we have to slightly modify our UserViewSet class:

class UserViewSet(viewsets.ModelViewSet):
# ———- Other properties are omitted for simplicity ———— #
http_method_names = [‘get’, ‘put’, ‘delete’]

The http_method_names property allows us to define the HTTP methods our viewset should support. Let’s run the tests, and we will see that the test test_authenticated_user_cannot_make_another_user_a_superuser has failed. This happened because the test expected the server to return a 404 Not Found HTTP error, but it returned a 405 Method Not Allowed HTTP error instead. Now, we need to add another test to ensure that the PATCH method is not handled and that one user cannot update the profile of another user.

Step 1. Add the following test right below test_authenticated_user_cannot_make_another_user_a_superuser to verify that the PATCH request isn’t allowed.

class TestProfileManagement:
# ———- Other methods are omitted for simplicity ———— #
@pytest.mark.django_db
def test_patch_request_should_not_be_allowed(self, auth_client, user):
response = auth_client.patch(f’/api/users/{user.pk}/’, {‘first_name’: ‘John’})

assert response.status_code == status.HTTP_405_METHOD_NOT_ALLOWED

Step 2. Replace the test test_authenticated_user_cannot_make_another_user_a_superuser with the updated version below. Since the PATCH method isn’t allowed anymore, I replaced the API call used in this test from PATCH to PUT.

class TestProfileManagement:
# ———- Other methods are omitted for simplicity ———— #
@pytest.mark.django_db
def test_authenticated_user_cannot_make_another_user_a_superuser(self, auth_client, user_factory):
# first_login=False is used to work around a bug with attempt to change request data
another_user = user_factory(is_superuser=False, first_login=False)
update_data = {**self.update_user_payload(another_user), ‘is_superuser’: True}
response = auth_client.put(f’/api/users/{another_user.pk}/’, update_data)

another_user.refresh_from_db()

assert response.status_code == status.HTTP_404_NOT_FOUND
assert another_user.is_superuser is False

If we run tests now, we will see they all pass.

Conclusion

We have implemented all needed tests and fixed so many issues. At this point we could stop. But I would like to show how this functionality should have been done much better and simpler with tools provided by the django-rest-auth package. This is a topic of the next tutorial.

Legacy Lobotomy — Improving Validation in the Profile Management API was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.

​ Level Up Coding – Medium

about Infinite Loop Digital

We support businesses by identifying requirements and helping clients integrate AI seamlessly into their operations.

Gartner
Gartner Digital Workplace Summit Generative Al

GenAI sessions:

  • 4 Use Cases for Generative AI and ChatGPT in the Digital Workplace
  • How the Power of Generative AI Will Transform Knowledge Management
  • The Perils and Promises of Microsoft 365 Copilot
  • How to Be the Generative AI Champion Your CIO and Organization Need
  • How to Shift Organizational Culture Today to Embrace Generative AI Tomorrow
  • Mitigate the Risks of Generative AI by Enhancing Your Information Governance
  • Cultivate Essential Skills for Collaborating With Artificial Intelligence
  • Ask the Expert: Microsoft 365 Copilot
  • Generative AI Across Digital Workplace Markets
10 – 11 June 2024

London, U.K.